ABSTRACT

Over the past decade logic programming has received considerable attention. This is partly because
the declarative nature of Prolog, a language that derives directly from a subset of first-order
logic, allows a clear, concise, human-oriented representation of programs, and partly because
parallelism can be identified more easily in logic programs than in imperative-style programs.
This latter reason has motivated a substantial research effort, whose objectives can be divided into
two  distinct  directions.  The  first  direction  attempts  to  extract  the  parallelism,  inherent  in  first-order 
logic,  out  of  regular  Prolog  programs,  so  that  they  can  gain  a  significant  speed-up  factor.  The  second, 
more  ambitious,  tries  to  define  a  new  parallel  language,  founded  on  logic,  where  not  only  is  parallelism 
made  more  explicit  and  controllable  by  the  user,  but  where  computations  in  concurrent  systems  can 
be  naturally  expressed  too.  This  report  concentrates  on  this  second  direction;  it  presents  the  various 
proposals,  in  the  order  they  were  introduced,  by  identifying  their  design  motivation,  elaborating  on  their 
powerful  features,  commenting  on  their  weak  points  and  assessing  their  overall  influence.  Examples  are 
presented,  which  capture  the  flavor  of  each  proposed  language  — both  semantically  and  syntactically — 
but  also  guide  the  reader  through  a  number  of  interesting  paradigms  for  parallel  programming  in  logic. 
The  development  of  parallel  logic  programming  languages,  although  led  by  different  research  groups 
at  different  places,  has  had  direct  mutual  influence,  as  strengths  and  weaknesses  of  one  proposal  have 
prompted  changes  to  another.  This  report  does  not  attempt  to  be  exhaustive  but  rather  to  provide  a 
deep  understanding  of  the  concepts,  problem-areas,  proposed  solutions  and  trade-offs  associated  with 
the  design  of  such  languages.  It  is  felt  that  such  understanding  will  be  the  guide  to  designing  better 
languages  for  parallel  logic  programming  in  the  future.  Although  familiarity  with  logic  programming 
and  parallel  programming  would  be  helpful,  it  is  not  strictly  necessary,  as  an  effort  was  made  to  keep 
the  presentation  sufficiently  self-contained. 

Keywords:  logic  programming,  parallel  programming,  parallel  logic  programming  languages, 
Concurrent  Prolog,  Parlog,  Guarded  Horn  Clauses. 
1      Introduction

1.1  Switching  to  Parallelism 

The rapid progress in micro-electronics and VLSI technology has reduced the cost of hardware
to such a degree that the design and development of novel parallel computer architectures has
become a reality. Over the last five years, experimental multiprocessor computer systems have
emerged in research laboratories and academic institutions and are slowly but steadily entering
industry. Although such computers offer significantly superior processing capabilities
when compared with conventional sequential systems, the overall improvement obtained has
not met general expectations. Part of the blame is attributed to the nature of the architectures
themselves; a major disappointment, however, is directly associated with programming
such computers.

Over the long reign of sequential "von Neumann" computer systems, a number of languages
have been proposed and termed high-level, as they provided a substantial level of
abstraction over the respective machine languages. Yet these languages share some fundamental
characteristics which are linked to the nature of the computers on which they run,
rather than to the thinking process of the humans who program them. Even worse, some
of these common characteristics, such as programming with the concept of state, were an
immediate result of the sequential nature of the underlying machine. No wonder, then, that the
transition to parallel computers also required a transition to a different programming methodology
and different languages. As early as 1977, Backus recognized the limited expressive power, complexity
and inflexibility of conventional von Neumann programming languages. In his Turing Award
lecture [Bac78], he gave an extensive account of his dissatisfaction with such languages and
proposed as a cure the syntactic and semantic elegance of applicative languages.

Applicative languages are those whose statements enjoy a declarative interpretation
and are completely machine-independent, i.e. contain no machine-level side-effects.
As a consequence, no reference to the machine's behavior is required when describing a problem.
The declarative interpretation, in turn, makes applicative programs more
human-oriented and amenable to mathematical treatment. Such programs can often be regarded
as "runnable specifications". Although applicative languages have been proposed since
the early years of computing, their conceptual distance from machine architectures prevented
them from becoming widely accepted. Only recently have they attracted worldwide
effort, in response to the challenge of parallelism, for which they appear to offer greater
potential. Among other advantages, programmers do not
need to assume the responsibility of identifying code fragments that can run in parallel: the
languages themselves contain operations that can inherently be performed in parallel.

1.2  Logic  Programming  and  Prolog 

The use of recursive functions for programming in McCarthy's Lisp and the use of functional
forms in Backus' FP are two clear examples of the applicative approach. An even more human-
oriented direction is given by the use of first-order logic, advocated by logic programming.
Prolog, the most widely accepted logic programming language, is based on a subset of first-
order logic known as definite or Horn clauses, which take the form

H if B1 and B2 and ... and Bn

where H, B1, B2, ..., Bn, called literals, are all terms of predicate calculus, H being the head
and B1 and B2 and ... and Bn the body. Prolog programs consist of a finite set of Horn
clauses augmented with the following computational mechanisms:

•  Resolution, for replacing a goal by the body of a clause whose head it matches, in an
attempt to obtain a successful derivation.

•  Unification, for selecting an appropriate clause to resolve with.

•  Backtracking, for trying alternative ways to obtain a successful derivation.

Each resolution step implicitly contains two forms of nondeterminism: which literal to
choose first when resolving a conjunction of goals and which of two or more alternative clauses
for a goal to consider first. Unlike Horn clauses, Prolog is equipped with two standard strategies
which eliminate the above forms of nondeterminism:

•  literals in a conjunction are resolved left-to-right (computation rule),

•  clauses in a disjunction are chosen in the order they appear textually (search rule).

In addition, backtracking is performed in a strictly chronological fashion.

Prolog  programs  can  be  understood  as  a  set  of  descriptive  statements  about  a  problem.  In 
addition  to  this  declarative  semantics  inherited  from  logic,  Prolog  programs  can  be  described 
in  a  more  conventional  fashion  as  a  series  of  steps  passed  through  during  program  execution. 
This  different  way  of  understanding  the  semantics  is  called  the  procedural  interpretation  of 
logic  programs. 

As an example of a Prolog program we show the clauses required for partitioning a list
of elements L into two sublists L1 and L2, containing the elements less than or equal to X and
those greater than X, respectively. The program appears in figure 1 and will serve as a basis for
comparison among various proposals in later sections of this report. In the first clause, the
head element of the input list is found to be less than or equal to X and is thus entered as the head
of list L1. In the second clause, it is found to be greater than X and is entered at the head of list
L2. Finally, if the input list is exhausted, L1 and L2 are closed by binding their tails to the
empty list.

partition(X, [L|Ls], [L|Xs], Ys) :- L =< X, partition(X, Ls, Xs, Ys).
partition(X, [L|Ls], Xs, [L|Ys]) :- L > X, partition(X, Ls, Xs, Ys).
partition(X, [], [], []).

?- partition(X, L, L1, L2).

Figure 1: List partitioning in Prolog
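For readers less familiar with Prolog, the three clauses can be mirrored by a short Python function. This is only a sketch of the input-to-output reading of the program: the name partition and the tuple result are choices made here, and unlike the Prolog relation the function cannot be run "backwards" or with unbound arguments.

```python
def partition(pivot, items):
    """Split items into (those <= pivot, those > pivot), preserving order."""
    if not items:
        # Third clause: the input is exhausted, so both outputs are closed.
        return [], []
    head, tail = items[0], items[1:]
    smaller, larger = partition(pivot, tail)
    if head <= pivot:
        # First clause: the head joins the first sublist.
        return [head] + smaller, larger
    # Second clause: the head joins the second sublist.
    return smaller, [head] + larger


print(partition(3, [5, 1, 4, 3, 2]))
```

Each branch corresponds to exactly one clause, and the two comparisons play the role of the tests L =< X and L > X in the clause bodies.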

In practice, Prolog systems include a variety of built-in predicates assisting program development,
many of which are defined outside first-order logic and are hence termed extralogical. Typical
examples of such predicates are the cut ("!") for controlling the search space, bagof for
collecting multiple solutions of a goal into a list, and assert and retract for explicitly adding
clauses to and removing them from the program base.
1.3      Sources  of  Parallelism  in  Horn  Clauses

In contrast to Prolog, Horn clauses do not dictate any particular execution strategy during
resolution and thus offer a broader scope for exploiting available parallelism. All calls in a
conjunction can be evaluated concurrently, resulting in a form of parallelism known as
AND-parallelism. In addition, all alternative clauses for the same predicate can be tried in parallel,
resulting in a form called OR-parallelism. Finally, at a considerably lower level than these
two lies unification parallelism, whose importance is however negligible compared to AND- and
OR-parallelism and whose effect is limited by theoretical results.
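The difference between the two main forms can be illustrated with a deliberately simplified Python sketch. It assumes that goals and alternatives are plain zero-argument functions returning True or False, so it ignores variable bindings entirely, which is where most of the real difficulty lies; the helper names and_parallel and or_parallel are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def and_parallel(goals):
    """Run every goal concurrently; the conjunction succeeds only if all do."""
    with ThreadPoolExecutor() as pool:
        return all(pool.map(lambda goal: goal(), goals))


def or_parallel(alternatives):
    """Try every alternative concurrently; succeed if at least one does."""
    with ThreadPoolExecutor() as pool:
        return any(pool.map(lambda alt: alt(), alternatives))


# Conjunction: both membership tests must hold.
print(and_parallel([lambda: 2 in [1, 2, 3], lambda: 2 in [2, 4]]))
# Disjunction: one successful alternative is enough.
print(or_parallel([lambda: 5 in [1, 2, 3], lambda: 5 in [5, 6]]))
```

A realistic system would also abandon the remaining work as soon as the overall outcome is known; the sketch simply waits for all tasks to finish.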

Various special cases of AND-parallelism can be considered, such as stream parallelism,
where the value of a variable shared between two concurrent calls is communicated incrementally,
join parallelism and all-solutions parallelism, where calls in a conjunction work
in parallel on different solutions, etc. Similarly, OR-parallelism can be shallow (only unification
of alternatives is performed in parallel), deep (the bodies of successful clauses are also
executed in parallel), and so on. An extensive account of all forms of parallelism in the logic
programming framework, accompanied by various proposals in each category, is given by Syre
and Westphal in [SW85]. Many of the forms above will also become clearer in the sections to
follow.


2      Origins  of  Parallel  Logic  Programming  Languages 

The origins of parallel logic programming languages can be traced back to a proposal by
Kahn  and  MacQueen  of  a  language  for  coroutining  and  programming  of  dynamically  evolving 
networks  of  processes  [KM77].  This  language  was  in  turn  a  refinement  of  a  previous  proposal 
by  Kahn,  which  had  been  presented  together  with  a  mathematical  semantics  in  [Kah74].  In 
the  Kahn  and  MacQueen  model  — as  also  in  the  original —  programs  were  built  around  the 
notion  of  a  process  which  can  communicate  via  unidirectional  channels.  If  we  view  processes 
as  nodes  and  channels  as  directed  edges,  then  at  any  time  the  execution  of  a  program  defines 
a  network  of  communicating  processes.  This  network  may  be  reconfigured  as  the  execution 
continues  by  substituting  a  node  with  an  appropriate  subgraph,  or  by  deleting  a  node  and  all 
edges  incident  upon  it. 

Processes  communicate  by  explicit  calls  to  GET  and  PUT  primitives  naming  the  channel 
and  supplying  the  appropriate  parameters.  These  primitives  also  designate  the  producer  and 
consumer  processes  for  a  channel,  which  in  turn  specifies  the  direction  of  communication.  A 
process  is  the  producer  to  all  channels  to  which  it  issues  a  PUT  command  and  a  consumer  to 
those  to  which  it  issues  a  GET  command.  Note  that  a  process  can  be  both  the  producer  and 
consumer  for  a  channel,  resulting  in  a  cyclic  network  structure. 

In the coroutine mode of execution such primitives block the issuing process and switch
control to another process. Execution of the original process will resume when another blocks
on the same channel. In contrast, a parallel mode of execution allows all producer processes
to run in a "non-blocking" fashion, assuming unbounded buffering. Consumer processes
may block, depending on the availability of data on their input channel, but there is no explicit
transfer of control. An intermediate alternative, where an anticipation coefficient is associated
with a channel to restrain the otherwise unbounded execution of producers, was also considered by the
developers.
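The parallel mode can be approximated in Python with threads and a queue. The sketch below is an analogy rather than the Kahn and MacQueen language: the function names, the DONE end-of-stream marker and the use of a bounded queue to stand in for the anticipation coefficient are all assumptions made for illustration.

```python
import queue
import threading

DONE = object()  # hypothetical end-of-stream marker, not part of the model


def producer(out, n):
    """PUT the integers 1..n on the channel, then signal the end."""
    for i in range(1, n + 1):
        out.put(i)  # never blocks when the queue is unbounded (maxsize=0)
    out.put(DONE)


def consumer(inp, results):
    """GET values from the channel and record their squares."""
    while True:
        item = inp.get()  # blocks while the channel is empty
        if item is DONE:
            break
        results.append(item * item)


def run(n, anticipation=0):
    """Run one producer and one consumer; anticipation bounds the buffer."""
    channel = queue.Queue(maxsize=anticipation)
    results = []
    threads = [
        threading.Thread(target=producer, args=(channel, n)),
        threading.Thread(target=consumer, args=(channel, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run(5))                  # unbounded buffering
print(run(5, anticipation=2))  # producer may run at most 2 items ahead
```

With a positive anticipation value the producer's put call blocks once that many items are pending, which limits how far it can run ahead of the consumer without any explicit transfer of control.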

As  an  example  of  a  program  in  this  model  consider  the  generation  of  an  infinite  list  of 
prime  numbers  using  the  sieve  of  Eratosthenes,  as  presented  in  [KM77].  One  process  is  used  to 
generate  all  positive  integers  and  another  to  spawn  a  "filter"  process  for  each  prime  number, 
which removes all multiples of that prime from subsequent integers. The complete code appears
in figure 2.


Process INTEGERS N out S0;
    repeat INCREMENT N; PUT(N,S0) forever
Endprocess;

Process FILTER PRIME in S1 out S2;
    Vars X;
    repeat
        GET(S1) -> X;
        if (X MOD PRIME) ≠ 0 then PUT(X,S2) close
    forever
Endprocess;

Process SIFT in S3 out S4;
    Vars PRIME;
    GET(S3) -> PRIME;
    PUT(PRIME,S4);
    doco channels S;
        FILTER(PRIME,S3,S); SIFT(S,S4)
    closeco
Endprocess;

Start doco channels CH1, CH2;
    INTEGERS(1,CH1); SIFT(CH1,CH2)
closeco;

Figure  2:  Prime  number  generation  in  the  parallel  language  of  Kahn  and  MacQueen. 
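The same network can be sketched with Python generators, which behave much like the coroutine mode of execution: each generator runs only when its consumer asks for the next value. The function names below follow the process names of figure 2, but the code is an illustration, not a translation of the original language.

```python
from itertools import count, islice


def integers(n):
    """INTEGERS: emit n+1, n+2, ... forever."""
    return count(n + 1)


def filter_multiples(prime, source):
    """FILTER: pass on only the values that are not multiples of prime."""
    for x in source:
        if x % prime != 0:
            yield x


def sift(source):
    """SIFT: emit the first value as a prime, then sift the filtered rest."""
    prime = next(source)
    yield prime
    # The process replaces itself by a FILTER feeding a new SIFT.
    yield from sift(filter_multiples(prime, source))


primes = sift(integers(1))
print(list(islice(primes, 10)))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Every prime found adds one more filter_multiples stage to the chain, which mirrors the way the process network grows as the computation proceeds. The nesting depth grows with the number of primes, so this sketch is only suitable for short prefixes of the sequence.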

The  formulation  of  the  Kahn  and  MacQueen  model  in  first-order  logic  was  performed  by 
van  Emden  and  de  Lucena  Filho  [vEdLF82].  Their  work  showed  that  all  essential  components 
of  parallel  computation  such  as  process  creation,  termination,  interprocess  communication, 
networks  of  parallel  processes  and  so  on,  are  already  inherent  in  first-order  logic  and  can  be 
expressed  by  simple  logic  statements.  Hence,  apart  from  the  usual  declarative  and  procedural 
interpretation,  logic  programs  were  shown  to  enjoy  yet  another  interpretation:  a  process 
interpretation. Figure 3 summarizes the process interpretation of logic programs. A key
concept under this interpretation is that any process is eligible for activation except those
blocked in an input operation on an empty channel. This rule compensates for the lack of
explicit GET and PUT operations in logic.

Figure 4 shows the set of logic statements required to generate the list of prime numbers,
achieving exactly the same behavior as the code in figure 2.¹ Figure 5, in turn, shows the
series of network configurations assumed by the program as it starts executing the goal

← (integers(1, ch1) & sift(ch1, ch2))

Another proposal, which can be thought of as the functional equivalent of the van Emden
and de Lucena Filho model, was introduced by Hansson, Haridi and Tärnlund [HHT82]. Their
language also featured static and dynamically evolving networks of processes, stream communication,
and computation depending upon the instantiation pattern of input arguments.
Moreover, it incorporated functions definable by equalities or conditional equalities. The prime
number generator given in [HHT82] is the exact functional equivalent of the one in figure 4
and maintains the same operational behavior.


1  Variables  are  denoted  by  terms  starting  with  a  lower-case  letter. 


Parallel Programming               Logic Programming

Process                            Goal
Network of processes               Conjunction of goals
Process creation                   Resolving a goal to a rule clause (a)
Process termination                Resolving a goal to a unit clause (b)
Process behavior rules             Clauses for a predicate
Perpetual or long-lived process    Tail-recursive clauses
Communication channel              Shared variable in a conjunction
Stream/Unbounded buffer            Incremental shared variable (c) in a conjunction
Process state                      Unshared (local) variables of a predicate
Synchronization                    Unification constraints on input arguments

(a) clause with a nonempty body; in fact, the body should contain at least one goal other than a possible
recursive call.
(b) clause with an empty body.
(c) partially instantiated structure.

Figure  3:  The  process  interpretation  of  logic  programs. 

{ integers(n, n'.s0) ← sum(n, 1, n') & integers(n', s0),
  filter(prime, x.s1, x.s2) ← (mod(x, prime, m) & m ≠ 0) & filter(prime, s1, s2),
  filter(prime, x.s1, s2) ← (mod(x, prime, m) & m = 0) & filter(prime, s1, s2),
  sift(prime.s3, prime.s4) ← filter(prime, s3, s) & sift(s, s4) }

Figure 4: Generating primes in the van Emden and de Lucena Filho model.

3     IC-Prolog 

IC-Prolog, developed at Imperial College by Clark and McCabe in 1979 [CM79], embodies
the general tendency of that period to break away from the conventional depth-first, left-to-
right evaluation strategy of Prolog. Its most significant difference from Prolog is the use of
program annotations to control the computation. Programming in IC-Prolog can be thought
of as a two-step process: in the first step the programmer concentrates on the logic part
of the program; in the second the programmer chooses the appropriate control strategy. This two-phase
programming methodology is in accordance with Kowalski's "Algorithm = Logic + Control"
philosophy [Kow79].

If the control component of an IC-Prolog program is left unspecified, its execution
strategy defaults to a Prolog-style evaluation. Some of the non-Prolog-style evaluation
mechanisms that can be achieved with IC-Prolog annotations are presented in [CMG82] and
described with examples below:

Asynchronous Parallel Evaluation: This is achieved by replacing the sequential "&"
connective by the parallel "//" between a number of conjunctive goals. If the goals do not
share any output variables, they can be considered as asynchronous (quasi-)parallel processes.
For example, the call

member(x, list1) // member(x, list2)

forks two independent processes to test whether element x is a member of both list1 and list2.
If either process fails, the call fails as well. If both processes complete successfully, the call
succeeds. The two calls to member are independent, since they do not share common output
variables.

Figure 5: Sequence of network configurations for the first prime numbers.

A different kind of asynchronous parallel evaluation can result if a //-connected conjunction
of goals shares output variables. Consider the call

profile(prof, tree1) // profile(prof, tree2)

which attempts to determine whether two tree structures tree1 and tree2 are isomorphic by
comparing their corresponding profiles on an element-by-element basis. We would like the
call to fail as soon as an inconsistency between the two profiles is detected. The above
call achieves exactly this, by setting up a two-way communication channel, represented by
the partially instantiated list prof, between the two parallel processes, which behave as
follows: when either process further instantiates prof, the other should consult and
agree with the binding; if it does not, a mismatch has been detected and both processes are
aborted. Which process will further instantiate the shared variable next is insignificant and
depends on the processes' relative speed and the scheduling policy in effect. Note, however,
that no race conditions may occur, since parallelism is simulated in a time-sharing fashion
with a time-slice sufficient to complete at least one resolution step.

Parallelism with Directed Communication: The two-way communication scheme above
can be restricted to a one-way (directed) communication with a designated producer and
consumer by decorating the occurrence of the shared variable in the producer process with a
"↑" annotation or the corresponding occurrence in the consumer process with a "?" annotation.
In the call

profile(prof↑, tree1) // profile(prof, tree2)

the first profile process acts as the producer for variable prof, whereas the second acts as its
consumer. The second profile process will suspend on any attempt to instantiate prof. The
same behavior would result from annotating prof as prof? in the second profile call.

Data-flow Coroutining (lazy evaluation): When the relative speeds of two processes proceeding
in parallel are not comparable, the producer process may run arbitrarily far
ahead of the consumer, producing data which may or may not be useful to the consumer.
In cases where a tight one-to-one cooperation between producer and consumer is desired, a
technique known as lazy evaluation is called for. In such a mode of evaluation, control is
switched back and forth in a coroutine fashion depending on the flow of data. The effect of
lazy evaluation is achieved in IC-Prolog by using the sequential "&" connective and decorating
the shared variable with the appropriate annotation. As an example consider the two calls

integers(1, ints↑) // sift(ints, primes)

and

integers(1, ints↑) & sift(ints, primes)

producing an infinite list of prime numbers. Whereas the first call spawns two independent
processes, one for generating an infinite list of positive integers and one for screening them, the
second one introduces no forking. It maintains only one process active at any one time, which
alternates between the two goals. The "↑" annotation makes the first goal a lazy producer of
the instantiations of the shared variable ints.
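The demand-driven behavior of a lazy producer can be observed directly with a Python generator. In the sketch below, which is an analogy and not IC-Prolog, the list produced is a hypothetical log added only to make visible how much work the producer has done.

```python
from itertools import islice

produced = []  # log of the values the producer has actually generated


def lazy_integers(start):
    """Generate start, start+1, ... one value per request from the consumer."""
    n = start
    while True:
        produced.append(n)  # record the work done so far
        yield n             # control returns to the consumer here
        n += 1


def take_squares(source, k):
    """Consume exactly k values from the source and square them."""
    return [x * x for x in islice(source, k)]


result = take_squares(lazy_integers(1), 3)
print(result)    # [1, 4, 9]
print(produced)  # [1, 2, 3]
```

The producer has generated exactly as many values as the consumer requested, which is the tight one-to-one cooperation described above; an eager producer running in parallel could have generated arbitrarily many more.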

Backtrackable Coroutining: If, during coroutine evaluation of two goals, the consumer goal
fails for the value acquired from its producer, the producer can backtrack and attempt
to derive an alternative value for that shared variable. The coroutine mode of execution
may subsequently resume normally. As the following example illustrates, coroutines enhanced
with such a backtracking mechanism can result in a very expressive programming idiom for
generate-and-test programs. The example appeared in [CMG82] and presents the top-level
clauses for deriving a solution to the 8-queens problem. A solution instance to this problem
is represented as a permutation of the list of numbers 1 to 8, where the i-th number in the
permutation is the column position of the queen in the i-th row. The clauses are

queens(x) ← perm(1.2.3.4.5.6.7.8.Nil, x↑) & safe(x)

safe(q.qs) ← notattack(q.qs) // safe(qs)

For each new number placed on variable x↑ by the producer, the corresponding new queen q is
checked for safety against all other queens qs already on the board, by means of the predicate
notattack. If at any time predicate notattack fails to prove the new queen's
position safe, a backtracking process is initiated which prompts the producer (perm) to alter
the last instantiation to x↑.
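The interleaving of generation and testing can be sketched in Python with a recursive generator. This is an analogy under stated assumptions: the helper names not_attack and queens are chosen here, the board is built row by row rather than through a separate perm process, and Python's loop over candidate columns stands in for backtracking into the producer.

```python
def not_attack(col, placed):
    """True if a queen in the next row at column col is safe from placed queens."""
    row = len(placed)
    return all(c != col and abs(c - col) != row - r
               for r, c in enumerate(placed))


def queens(n=8, placed=()):
    """Yield solutions as tuples; entry i is the column of the queen in row i."""
    if len(placed) == n:
        yield placed
        return
    for col in range(1, n + 1):      # producer: propose the next column
        if not_attack(col, placed):  # consumer: test it immediately
            yield from queens(n, placed + (col,))
        # On failure the loop continues: the producer tries its next alternative.


print(next(queens()))  # (1, 5, 8, 6, 3, 7, 2, 4)
```

Because each new queen is tested as soon as it is proposed, a bad prefix is abandoned at once instead of after a complete permutation has been generated, which is the advantage of this idiom over naive generate-and-test.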

Control annotations can also be used in the heads of clauses, resulting in various indexing
schemes useful in controlling exploration of the search space. Other useful features
of IC-Prolog were the treatment of I/O predicates as predicates dealing with streams and the
incorporation of guards as a means of delaying the communication of bindings in a conjunction
of goals. More specifically, the purpose of the ":" annotation in the clause

H ← G1 & G2 & ... & Gn : B1 & B2 & ... & Bm

was to delay the communication of variable bindings which result from the unification with
the head of the clause H until after the successful evaluation of goals G1, G2, ..., Gn. The
concept of guards was inspired by Dijkstra's guarded commands [Dij75], although guarded
IC-Prolog clauses are also subject to backtracking. This means that any number of clauses
(rather than just a single one) whose guards evaluate to true might be used to solve a goal via
backtracking.

IC-Prolog seemed to be a well-suited vehicle for programming in logic and in particular
for expressing algorithms that are intrinsically concurrent or can be more efficiently presented
by means of some form of non-sequential evaluation. Unfortunately, the implementation of the
language turned out to be surprisingly complex, mainly due to the presence of backtracking.
Moreover, as Gregory reports, the nature of the backtracking mechanism supported by the
language allowed only a global implementation scheme², which is known to be unsuitable for
an efficient realization on parallel architectures.


4     The  Relational  Language 

The Relational Language (RL), developed by Clark and Gregory at Imperial College [CG81],
came to remedy some of the deficiencies of IC-Prolog, yet at a cost. Backtracking was eliminated
as too costly to implement in a parallel environment. The notion of
nondeterministic commitment to a single clause could now be introduced among guarded
clauses for the same predicate and be defined in the same manner as the selection of a single
statement list in Dijkstra's guarded commands. This decision was also influenced by the elegant
adoption of Dijkstra's formalism in Hoare's CSP parallel programming model [Hoa78].
A fundamental consequence of adopting such a commitment rule was that variables became
single-assignment, since control could not backtrack past variable bindings already performed.
This key feature allows an efficient implementation in a parallel environment.
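Commitment to a single guarded clause can be sketched in Python as follows. The representation of a clause as a (guard, body) pair and the helper name solve are assumptions made for this illustration, and guards are tried in textual order here for simplicity, whereas the commitment rule itself allows any clause whose guard succeeds to be chosen.

```python
def solve(clauses, *args):
    """Commit to one clause whose guard holds and run its body; never retry."""
    for guard, body in clauses:
        if guard(*args):
            # Commitment: the remaining clauses are discarded for good.
            return body(*args)
    # No guard succeeded: the call fails, and there is no backtracking.
    raise RuntimeError("no guard succeeded")


# Subtractive greatest common divisor for positive integers,
# written as three guarded clauses.
GCD = [
    (lambda x, y: x == y, lambda x, y: x),
    (lambda x, y: x > y, lambda x, y: solve(GCD, x - y, y)),
    (lambda x, y: y > x, lambda x, y: solve(GCD, x, y - x)),
]

print(solve(GCD, 12, 18))  # 6
```

Since the three guards are mutually exclusive, the result does not depend on which succeeding clause is chosen; this is the property a programmer must ensure once backtracking is no longer available to recover from a wrong choice.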

Moreover, the term annotations of IC-Prolog were replaced by a different means of
achieving communication among processes: mode declarations. Each predicate definition in
RL is associated with one or more mode declarations, each of which takes the form

mode P(m1, m2, ..., mn)

where P is the predicate name of arity³ n and each mi is either a "?" or a "↑". The "?"
symbol specifies an input argument position, whereas the "↑" specifies an output argument position.

Mode declarations impose communication constraints as follows:

•  Unification of the head of a clause with a call should succeed without instantiating a
variable appearing in an input argument position of the call to a non-variable term.

•  If performing such an instantiation is the only way of achieving unification, the computation
suspends, awaiting further instantiation of the input variable.

Arguments at input positions are strong in RL, i.e. they are fully constructed by their producer.
An output argument position, on the other hand, indicates that the term corresponding to
this argument can only be used for instantiating a variable argument in the call.
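The suspension rule for input positions amounts to a one-way match of the clause head against the call. The Python sketch below illustrates it under several assumptions that are not part of RL: compound terms are tuples whose first element is the functor name, atoms are strings, the class Var and the function match_input are hypothetical, and each head variable is assumed to occur only once.

```python
class Var:
    """An unbound logic variable (hypothetical representation)."""

    def __init__(self, name):
        self.name = name


def match_input(pattern, arg, bindings):
    """Match a head pattern against a call argument in an input position.

    Returns "ok", "fail" or "suspend". Variables of the call are never bound.
    """
    if isinstance(pattern, Var):
        bindings[pattern.name] = arg  # a head variable accepts anything
        return "ok"
    if isinstance(arg, Var):
        # Only binding the caller's variable could make this match succeed.
        return "suspend"
    if isinstance(pattern, tuple) and isinstance(arg, tuple):
        if len(pattern) != len(arg):
            return "fail"
        status = "ok"
        for p, a in zip(pattern, arg):
            outcome = match_input(p, a, bindings)
            if outcome == "fail":
                return "fail"
            if outcome == "suspend":
                status = "suspend"
        return status
    return "ok" if pattern == arg else "fail"


# Head pattern for a stream whose first message is Occupy.
PATTERN = ("cons", "Occupy", Var("requests"))

print(match_input(PATTERN, Var("s"), {}))                         # suspend
print(match_input(PATTERN, ("cons", "Occupy", Var("rest")), {}))  # ok
print(match_input(PATTERN, ("cons", "Release", Var("rest")), {})) # fail
```

A call with an unbound stream suspends until its producer supplies the first message, whereas a definite mismatch fails immediately; full unification would instead have bound the caller's variable to the head pattern.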

2 A scheme in which, upon failure of a process, the whole computation of this process and that of any other
processes with which it interacted is undone, back to the last choice point of the failed process.
3 The number of arguments of the predicate.


The adoption of guarded commands made RL sufficiently expressive to emulate Hoare's
CSP and Brinch Hansen's Distributed Processes (DP) [Bri78] closely. Both of these parallel
programming models are imperative in nature; both view the process as a fundamental concept
in program construction and the exchange of messages as the means of process communication
and synchronization. Both models directly influenced the tasking structure of Ada. Figure 6
shows the CSP program and the corresponding RL program for the greatest common divisor
problem. The left part of figure 7 shows the DP code of a monitor⁴ process which schedules
access to an abstract resource, as given by Brinch Hansen in [Bri78]. The right part of
the figure illustrates the representation of the same process by means of tail-recursive clauses.
The first argument of the resource predicate is the stream of monitor calls and the second
the internal state of the monitor (contained in variable free in the DP code).⁵

CSP:

    gcd ::
    *[ x > y → x := x - y
     □ y > x → y := y - x
     ]

RL:

    mode ↑ gcd-of <?,?>
    z gcd-of <z,z>
    z gcd-of <x,y> ← x > y | x1 is x - y & z gcd-of <x1,y>
    z gcd-of <x,y> ← y > x | y1 is y - x & z gcd-of <x,y1>

Figure 6: CSP and RL code for the greatest common divisor problem.

DP:

    process resource
        free : bool
        proc request
            when free : free := false end
        proc release
            if not free : free := true end
        free := true

RL:

    mode resourcemonitor(?)
    resourcemonitor(requests) ← resource(requests, True)

    mode resource(?, ?)
    resource(Occupy.requests, True) ← resource(requests, False)
    resource(Release.requests, False) ← resource(requests, True)
    resource(Nil, _)

Figure 7: DP and RL code for a resource scheduling monitor.

The  elimination  of  backtracking  had  consequences  on  the  style  of  programming  in  RL. 
Programmers  had  to  guarantee  that  the  guard  part  of  alternative  clauses  was  sufficient  to  des- 
ignate the  appropriate  clause.  Whereas  in  practice  such  property  is  not  difficult  to  maintain, 
it  is  definitely  a  drastic  departure  from  conventional  Prolog  programming  style. 

The  RL  model  fits  very  closely  to  the  parallel  programming  formalism  of  van  Emden  and 
de  Lucena  Filho  as  well.  As  an  immediate  consequence,  RL  clauses  can  also  be  thought  of 
as  specifications  of  a  dynamic  network  of  processes;  in  the  same  way,  shared  variables  among 
goals  can  be  thought  of  as  communication  channels  with  mode  declarations  indicating  the 
direction  of  communication.  The  prime  number  generator  of  figure  4  can  be  written  in  RL  in 
a  very  similar  fashion. 

As  an  example  of  an  RL  program,  consider  a  system  with  two  client  processes  and  one 
server  process  serving  the  users'  requests  and  replying  back  to  their  messages6.    All  three 


4an encapsulation facility providing mutual exclusion and synchronization on shared data accesses.
5term "_" represents an anonymous variable.

6The  server  can  as  well  represent  a  shell  and  the  clients  the  users  in  a  multiprogramming  operating  system, 
as  illustrated  in  [CG81]. 
processes are executing concurrently. The network of processes representing the above system
in the RL model is depicted in figure 8. Two additional processes for merging the two


Figure 8: A network of two clients with a single server.

request  streams  into  a  single  stream  and  subsequently  distributing  the  single  replies  stream 
into  two  streams  are  required  as  illustrated  by  the  figure.  The  RL  code  for  this  configuration 
of  processes  is 

... client1(requests1^, replies1) // client2(requests2^, replies2) //
taggedmerge(requests1, requests2, requests^) // server(requests, replies^) //
taggedmerge(replies1^, replies2^, replies) ...

The  first  client  process  executes 

mode client1(^,?)

client1(request.requests, reply.replies) <-
    request1(request^) // reply1(reply) // client1(requests^, replies)

and  the  second  the  equivalent  code.  The  two  client  processes  may  even  be  identical,  in  which 
case  they  invoke  the  same  request  and  reply  predicates. 

Finally,  the  tagged  stream  merging  and  the  corresponding  distribution  is  handled  by  a 
single  predicate  taggedmerge  using  its  two  different  modes  of  operation  [CG81],  as  illustrated 
in  figure  9.  Predicates  such  as  taggedmerge  with  the  first  mode,  introduce  time  dependent 
computation,  a  component  that  has  been  claimed  essential  for  dealing  with  real-time  systems 
applications.  In  fact,  taggedmerge  is  used  to  join  two  streams  of  values  produced  by  two 
parallel  processes  into  a  single  stream  in  a  nondeterministic  fashion  as  soon  as  they  are  prop- 
agated from  their  producer.  But  then  the  order  of  elements  in  the  stream  depends  on  the 
relative  speed  of  the  two  producer  processes. 
mode taggedmerge(?,?,^)
mode taggedmerge(^,^,?)

taggedmerge(x.xs, ys, <1,x>.zs) <- taggedmerge(xs, ys, zs)
taggedmerge(xs, y.ys, <2,y>.zs) <- taggedmerge(xs, ys, zs)

Figure  9:  Tagged  stream  merging  in  Relational  Language. 
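
The two modes of taggedmerge can be mimicked in a conventional language. The following Python sketch is only an illustration and does not reproduce RL semantics: the arrival order, which in RL depends on the relative speed of the producers, is supplied explicitly as a hypothetical arrivals schedule, and the function names tagged_merge and tagged_distribute are assumptions made for this example.

```python
from collections import deque


def tagged_merge(xs, ys, arrivals):
    """First mode: merge two streams into one stream of (tag, value) pairs.

    `arrivals` is an assumed schedule (a sequence of 1s and 2s) standing in
    for the relative speed of the two producer processes.
    """
    xs, ys = deque(xs), deque(ys)
    zs = []
    for tag in arrivals:
        source = xs if tag == 1 else ys
        if source:  # ignore schedule entries for an exhausted stream
            zs.append((tag, source.popleft()))
    return zs


def tagged_distribute(zs):
    """Second mode: split a tagged stream back into its two component streams."""
    xs, ys = [], []
    for tag, value in zs:
        (xs if tag == 1 else ys).append(value)
    return xs, ys


# Two clients send requests; the stand-in server replies in upper case.
requests = tagged_merge(["a1", "a2"], ["b1"], arrivals=[2, 1, 1])
replies = [(tag, value.upper()) for tag, value in requests]
replies1, replies2 = tagged_distribute(replies)
```

A different arrivals schedule yields a different merged stream, which is the time dependence noted in the text, while the distributed reply streams stay the same.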

Finally, RL allowed for exact specification of the buffer size of a given communication
channel besides the default infinite size. For this purpose "^" annotations could be accompanied
by an integer indicating the amount of data the producer could generate ahead of the
consumer.  As  an  interesting  variation  of  the  bounded  buffering  scheme,  one  could  readily 
obtain  a  queueing  system  with  m  servers  with  a  single  queue  of  capacity  C  using  a  call  of  the 
form 

requests(rs^C) // merge(s1^0, ..., sm^0, rs) // server(s1) // ... // server(sm)

as  shown  in  [CG81].  An  analogous  system  with  a  queue  of  capacity  C  associated  with  each 
server  could,  instead,  be  obtained  using  a  call  of  the  form 

requests(rs^0) // merge(s1^C, ..., sm^C, rs) // server(s1) // ... // server(sm)

5      Concurrent  Prolog 

The communication constraints of RL were actually too strong. Processes could communicate
among themselves by sending only fully instantiated data. Thus, channels were in
effect unidirectional. As Gregory explains, this decision reflected the designers' initial goal of
efficiently implementing the language on a loosely coupled, message passing architecture.

Soon after the proposal of RL, two new language designs were put forward, both of which
relaxed the tight communication constraints of RL to allow for cooperative instantiation
of incremental shared terms, also known as the "logical variable" mechanism. These languages
are Concurrent Prolog and Parlog, and they are the subject of this and the next section
respectively.

Concurrent  Prolog  (CP),  developed  by  Shapiro  at  the  Weizmann  Institute  [Sha83],  inher- 
ited RL's  guarded  clauses  and  committed-choice  nondeterminism  but  used  a  different  mecha- 
nism to  constrain  communication.  Mode  declarations  were  eliminated;  modes  of  use  were  no 
longer  specified  by  the  procedure,  but  mostly  by  the  call.  Communication  constraints  were 
expressed  by  decorating  terms  with  read-only  "?"  annotations.  Quoting  from  Shapiro  [Sha83, 
page  11]: 

The unification of terms containing read-only variables is an extension to normal unification.
The unification of a read-only term X? with a term Y is defined as follows. If Y is
a non-variable, then the unification succeeds only if X is a non-variable, and X and Y are
recursively unifiable. If Y is a variable then the unification of X? and Y succeeds, and the
result is a read-only variable. The symmetric algorithm applies to X and Y?.

This  definition  of  unification  is  clearly  time-dependent.  A  unification  that  fails  to  satisfy 
any  of  the  conditions  above  may  succeed  later  when  another  process  instantiates  a  shared 
variable to a term. For this reason, Shapiro describes unification as a continuous activity
which  terminates  only  upon  success  or  upon  detection  of  failure  not  attributed  to  read-only 
constraints. 
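
The quoted rule can be made concrete with a small sketch. The Python code below is an illustration under stated assumptions, not Shapiro's algorithm: the classes Var and ReadOnly, the helper walk and the three result values are hypothetical names; unifying two unbound read-only variables is treated as a suspension, although the original definition leaves that case open; and bindings made before a suspension or failure are not undone.

```python
SUCCESS, SUSPEND, FAIL = "success", "suspend", "fail"


class Var:
    """A logical variable; `ref` is None while the variable is unbound."""
    def __init__(self, name):
        self.name, self.ref = name, None


class ReadOnly:
    """The read-only occurrence X? of a variable X."""
    def __init__(self, var):
        self.var = var


def walk(term):
    """Follow bindings; return (term, reached_through_read_only)."""
    read_only = False
    while True:
        if isinstance(term, ReadOnly):
            read_only, term = True, term.var
        elif isinstance(term, Var) and term.ref is not None:
            term = term.ref
        else:
            return term, read_only


def unify(a, b):
    """Unify atoms, tuples (functor, arg, ...) and variables."""
    a, a_ro = walk(a)
    b, b_ro = walk(b)
    if a is b:
        return SUCCESS
    a_var, b_var = isinstance(a, Var), isinstance(b, Var)
    if a_var and not a_ro:            # writable variable: bind it
        a.ref = ReadOnly(b) if (b_var and b_ro) else b
        return SUCCESS
    if b_var and not b_ro:
        b.ref = ReadOnly(a) if (a_var and a_ro) else a
        return SUCCESS
    if a_var or b_var:                # unbound read-only variable: wait
        return SUSPEND
    if isinstance(a, tuple) and isinstance(b, tuple):
        if len(a) != len(b) or a[0] != b[0]:
            return FAIL
        result = SUCCESS
        for x, y in zip(a[1:], b[1:]):
            outcome = unify(x, y)
            if outcome == FAIL:
                return FAIL
            if outcome == SUSPEND:
                result = SUSPEND
        return result
    return SUCCESS if a == b else FAIL


x = Var("X")
goal = ("f", ReadOnly(x))
before = unify(goal, ("f", "a"))   # X is unbound, so the unification must wait
x.ref = "a"                        # another process instantiates X
after = unify(goal, ("f", "a"))    # the same unification is retried
```

The same call gives different outcomes before and after X is instantiated, which is the time dependence described above.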

Using  the  read-only  annotation,  the  list  partitioning  program,  whose  Prolog  version  is 
shown  in  figure  1,  appears  in  figure  10. 7  A  variable  annotated  as  read-only  can  be  regarded  as 

partition(X,[L|Ls],[L|Xs],Ys) :- L =< X | partition(X,Ls?,Xs,Ys).
partition(X,[L|Ls],Xs,[L|Ys]) :- L > X | partition(X,Ls?,Xs,Ys).
partition(X,[],[],[]).

?- partition(X?,L?,L1,L2).

Figure  10:  List  partitioning  in  Concurrent  Prolog 

a protected data structure since it can be read by any process, but can be bound only through
the  instantiation  of  its  corresponding  write-enable  variable8.  This  property  of  read-only  terms 
has  occasionally  proven  quite  useful,  as  in  the  case  of  expressing  parallel  algorithms  in  CP 
[Hel84]  and  in  the  case  of  implementing  bounded  buffer  communication  for  CP  [TF85].  On 
the  other  hand,  this  feature  implies  that  synchronization  is  expressed  at  a  data-level  [TF86b]. 
Compared  with  the  procedure-level  expression  of  synchronization  in  RL,  the  CP  approach 
may  complicate  program  understanding.  Furthermore,  as  Saraswat  [Sar86]  points  out,  the 
behavior  of  a  predicate  depends  on  the  annotations  of  the  arguments  of  its  calls,  so  that  a 
static  analysis  appears  insufficient  to  prove  any  properties  about  CP  programs. 

5.1      Programming  with  the  Logical  Variable 

The  programming  examples  we  saw  so  far  can  be  expressed  in  an  equally  elegant  fashion  in 
a  number  of  other  concurrent  functional  and  stream-based  models  proposed.  A  feature  that 
is  unique,  however,  in  logic  programming  is  the  logical  variable,  which  can  be  viewed  in  a 
number  of  useful  ways  [Sha86a]: 

•  An  incrementally  sent  data  structure. 

•  A  cooperatively  constructed  data  structure. 

•  A  message  containing  a  communication  channel  routing  back  to  the  sender. 

The  first  interpretation  was  exploited  to  represent  the  potentially  infinite  stream  of  monitor 
calls  in  the  example  of  figure  7.  The  other  two  interpretations  imply  bidirectional  communi- 
cation and  passing  variables  in  messages,  both  disallowed  by  the  functional  character  of  RL. 
Their elegant use will be clarified by constructing a queue monitor which accepts a stream of
enqueue and dequeue messages and performs the appropriate action. An enqueue message contains
the term to be inserted in the queue, while a dequeue message contains an uninstantiated variable
to be bound to the term at the head of the queue. The queue monitor is represented as a long-lived
process and its definition appears in figure 11 [Sha86b]. Clearly, the stream of requests is an


7Variables in CP are designated by an upper-case first letter.
8the  same  variable  without  the  "?"  annotation. 




queuemonitor(Requests) :- queue(Requests?,Queue,Queue).

queue([enqueue(X)|Requests],Head,[X|Tail]) :- queue(Requests?,Head,Tail).
queue([dequeue(X)|Requests],[X|Head],Tail) :- queue(Requests?,Head,Tail).
queue([],_,_).

Figure  11:  A  queue  monitor  using  incomplete  messages  in  Concurrent  Prolog 

incrementally  bound  data  structure,  since  the  queue  monitor  callers  are  sending  their  messages 
at  arbitrary  times.  It  is  also  cooperatively  constructed,  since  both  the  callers  and  the  queue 
process  (in  the  case  of  dequeueing)  are  binding  the  stream.  Finally,  when  issuing  a  dequeue 
message,  the  caller  sends  along  with  it  an  uninstantiated  variable  which  is  a  communication 
channel  leading  back  to  the  caller.  All  three  views  of  the  logical  variable  are  combined  to  pro- 
duce this  elegant  version  of  the  queue  monitor.  Notice  that  as  process  interpretation  dictates, 
the  process  state  of  the  monitor  (the  queue  data  structure)  is  represented  by  local  variables 
of  the  predicate  (Head  and  Tail). 

This  queue  monitor  example  possesses  yet  another  interesting  property:  if  a  dequeue 
message  on  an  empty  queue  is  issued,  computation  does  not  suspend;  instead,  a  variable  term 
is  returned  to  be  instantiated  upon  a  subsequent  enqueue  operation.  Whenever  the  actual 
value  of  the  dequeued  item  is  needed,  computation  awaits  the  next  enqueueing.  This  scheme 
entails  more  parallelism  since  the  dequeuer  is  not  suspended  on  an  empty  queue  unless  really 
necessary.  We  can  say  that  we  obtain  negative  size  queues.  In  the  same  way,  a  process  may 
enqueue  an  uninstantiated  variable  to  be  bound  later  in  the  computation.  This  variable  may 
subsequently  be  dequeued  by  another  process.  In  such  cases,  the  queue  process  serves  merely 
as a link between the two processes, enabling them to set up their own communication channel. The
correlation  of  such  processes  is  depicted  in  figure  12.  This  interesting  capability  of  the  above 


Figure  12:  Enqueueing  and  dequeueing  an  uninstantiated  variable. 

clauses  was  first  brought  up  by  Shapiro  and  Takeuchi  [ST83]. 
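
The behavior of the queue on an empty state can be traced with a small simulation. The Python sketch below is an approximation under stated assumptions: LVar is a hypothetical single-assignment cell standing in for a logical variable, QueueMonitor.send stands in for the arrival of one message on the request stream, and the Head and Tail arguments of figure 11 are modelled as a chain of such cells.

```python
class LVar:
    """A single-assignment cell standing in for a logical variable."""
    def __init__(self):
        self.bound, self.value = False, None

    def bind(self, value):
        if self.bound:
            raise ValueError("variable already bound")
        self.bound, self.value = True, value


class QueueMonitor:
    """Queue kept as a chain of cells; Head and Tail coincide when it is empty."""
    def __init__(self):
        self.head = self.tail = LVar()

    def send(self, op, arg):
        if op == "enqueue":
            if self.tail.bound:
                # A dequeue ran ahead: its variable is waiting in this cell.
                waiting, self.tail = self.tail.value
                waiting.bind(arg)
            else:
                nxt = LVar()
                self.tail.bind((arg, nxt))
                self.tail = nxt
        elif op == "dequeue":
            if self.head.bound:
                # An item is available: bind the caller's variable to it.
                item, self.head = self.head.value
                arg.bind(item)
            else:
                # Empty queue: record the unbound variable and carry on.
                nxt = LVar()
                self.head.bind((arg, nxt))
                self.head = nxt
        else:
            raise ValueError("unknown message: " + op)


q = QueueMonitor()
a, b = LVar(), LVar()
q.send("dequeue", a)            # empty queue: the dequeuer is not suspended
unbound_after_dequeue = not a.bound
q.send("enqueue", "x")          # this enqueue instantiates a
q.send("enqueue", "y")
q.send("dequeue", b)
```

The first dequeue leaves its variable unbound until the following enqueue arrives, which corresponds to the "negative size" queue mentioned above.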

In  a  quite  similar  use  of  the  logical  variable,  one  can  obtain  a  neater  representation  of  the 
client/server  network  of  section  4.  The  call  in  CP  can  simply  be 

... client1(requests1), client2(requests2),
merge(requests1?, requests2?, requests), server(requests?) ...

where the requests1 and requests2 streams contain uninstantiated variables serving as communication
channels routing back the replies. The stream merging does not need to be tagged
merge([X|Xs],Ys,[X|Zs]) :- merge(Xs?,Ys,Zs).
merge(Xs,[Y|Ys],[Y|Zs]) :- merge(Xs,Ys?,Zs).
merge([],Ys,Ys).
merge(Xs,[],Xs).


Figure  13:  Nondeterministic  stream  merging  in  Concurrent  Prolog. 

anymore.    Its  clauses  in  CP  are  given  in  figure  13  [Sha83].    Finally  the  resulting  network 
configuration  is  depicted  in  figure  14. 


Figure  14:  A  network  of  two  clients  with  a  single  server  using  incomplete  messages. 
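
Why tagging becomes unnecessary can be seen in a small simulation. The Python sketch below is illustrative only: Reply is a hypothetical stand-in for the uninstantiated variable carried inside each request, merge produces just one admissible interleaving (simple alternation) whereas the CP merge is nondeterministic, and the upper-casing server is an invented placeholder for real request handling.

```python
class Reply:
    """Stands in for the uninstantiated variable sent along with a request."""
    def __init__(self):
        self.value = None


def client(name, payloads):
    """Build a request stream; each request carries its own reply slot."""
    return [(name + ":" + payload, Reply()) for payload in payloads]


def merge(xs, ys):
    """Return one possible interleaving of the two request streams."""
    xs, ys, out = list(xs), list(ys), []
    while xs or ys:
        if xs:
            out.append(xs.pop(0))
        if ys:
            out.append(ys.pop(0))
    return out


def server(requests):
    """Answer each request by filling in the reply slot it carries."""
    for payload, reply in requests:
        reply.value = payload.upper()


requests1 = client("c1", ["ls", "pwd"])
requests2 = client("c2", ["date"])
server(merge(requests1, requests2))
replies1 = [reply.value for _, reply in requests1]
replies2 = [reply.value for _, reply in requests2]
```

Each client finds its answers in the slots of its own requests, so no reply stream has to be distributed back, whatever interleaving the merge happens to produce.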


5.2      Metaprogramming  in  Concurrent  Prolog 

Metaprogramming  is  a  broadly  known  and  used  technique  in  the  logic  programming  commu- 
nity and  refers  to  the  ability  of  logic  programs  to  treat  programs  as  data  and  evaluate  data 
as  programs.  These  facilities  were  actually  developed  and  understood  under  the  functional 
programming  context,  the  only  other  model  that  supports  metaprogramming.  Metaprogram- 
ming allows  the  easy  development  of  interpreters,  compilers,  debuggers  and  other  useful  tools 
requiring  source  programs  as  input.  A  particularly  interesting  interpreter  for  a  given  language 
is  the  interpreter  written  in  the  language  itself.  Such  interpreters  are  called  meta-interpreters 
and  are  particularly  easy  to  write  for  logic  programming  languages.  Figure  15  illustrates 
one  such  meta-interpreter  for  CP,  originally  described  by  Shapiro  [Sha86b,SS86].  This  meta- 
interpreter  takes  as  input  a  set  of  clauses  denoted  as  Program  and  a  goal  to  solve  with  respect 
to  these  clauses.  The  first  reduce  clause  simply  states  that  the  goal  true  succeeds  under 
any  Program.  The  second  clause  decomposes  goal  conjunctions,  whereas  the  third  deals  with 
each individual call. Each call is resolved using an appropriate clause in Program, the body
of which is subsequently chosen to be reduced. The resolve predicate produces a copy of each clause
with fresh variables and attempts to unify its head with the call and to further reduce the guards of
reduce(Program,true).
reduce(Program,(A,B)) :- reduce(Program?,A?), reduce(Program?,B?).
reduce(Program,A) :- not(system(A?)) |
    resolve(Program?,A?,Program?,Body), reduce(Program?,Body?).
reduce(Program,A) :- system(A?) | A.

resolve(Program,A,[Clause|Clauses],Body) :-
    copy(Clause?,(Head:-Guard|Body)), A = Head, reduce(Program,Guard?) | true.
resolve(Program,A,[_|Clauses],Body) :- resolve(Program,A,Clauses?,Body) | true.

Figure  15:  A  meta-interpreter  for  Concurrent  Prolog. 

the  clause.  Finally,  the  last  reduce  clause  identifies  system  calls  and  executes  them  directly 
by  means  of  the  metavariable  call.  The  metavariable  call  is  similar  in  character  to  Lisp's  eval 
function  and  is  useful  in  executing  dynamically  determined  code.  In  fact,  assuming  Program 
is  globally  accessible,  the  entire  CP  meta-interpreter  of  figure  15  could  be  rewritten  as  the 
single  clause 

reduce(A) :- A.

by making use of the metavariable facility. In fact, as Shapiro [Sha86a] points out, a
meta-interpreter can be expressed at different levels of granularity. A choice is made according to
the needs in practice. For example, the one described in figure 15 is explicit in terms
of the CP goal reduction mechanism, but lets unification be expressed implicitly by means of
the call A = Head in the definition of resolve. Such a level of description could thus be the starting
point for a CP interpreter that provides an enriched reduction strategy. If unification had
been our focus as well, it would have had to be explicitly defined as a set of meta-rules unify,
similar in nature to those of reduce and resolve. However, as the grain becomes finer, the
complexity of the meta-interpreter grows disproportionately and the resulting loss of efficiency often
outweighs its potential advantages.

Observe  for  example  in  figure  16  a  CP  meta-interpreter  enhanced  to  be  able  to  terminate  a 
computation  upon  receipt  of  an  external  abort  signal.  The  clauses  for  reduce  and  resolve  are 

reduce(Program,true,Abort).
reduce(Program,(A,B),Abort) :- reduce(Program?,A?,Abort?), reduce(Program?,B?,Abort?).
reduce(Program,A,Abort) :- variable(Abort?), not(system(A?)) |
    resolve(Program?,A?,Program?,Body,Abort?), reduce(Program?,Body?,Abort?).
reduce(Program,A,Abort) :- variable(Abort?), system(A?) | A.
reduce(Program,A,abort).

resolve(Program,A,[Clause|Clauses],Body,Abort) :-
    copy(Clause?,(Head:-Guard|Body)), A = Head, reduce(Program,Guard?,Abort) | true.
resolve(Program,A,[_|Clauses],Body,Abort) :-
    resolve(Program,A,Clauses?,Body,Abort) | true.

Figure 16: An abortable meta-interpreter for Concurrent Prolog.

only slightly altered to account for this new requirement. The variable predicate is a built-in



predicate  checking  whether  its  argument  is  an  uninstantiated  variable.  This  and  a  number  of 
other  useful  enhancements  to  the  standard  CP  model,  including  goal  termination  detection, 
interrupt  handling,  deadlock  detection  and  several  others  are  described  as  meta-interpreters 
in  an  illuminating  paper  by  Safra  and  Shapiro  [SS86]. 
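
The effect of the extra Abort argument can be imitated outside CP. The Python sketch below is not a meta-interpreter in the sense used here, since it interprets a toy language of ground goals rather than CP itself; it only illustrates the control idea of checking an abort signal before every reduction. The program representation (a dictionary from goal names to clause bodies), the system table and the abort cell are all assumptions made for this example.

```python
def reduce(program, goal, abort, system):
    """Reduce `goal` under `program`, stopping once abort["signal"] is set.

    program: maps a goal name to its body, a list of goal names
    system:  maps the name of a system goal to a Python callable
    abort:   a one-slot cell; None plays the role of an unbound Abort variable
    """
    pending = [goal]                      # conjunction of goals still to reduce
    while pending:
        if abort["signal"] is not None:   # Abort is no longer a variable
            return "aborted"
        current = pending.pop(0)
        if current == "true":
            continue
        if current in system:             # system call: execute it directly
            system[current]()
        else:                             # user goal: replace it by its body
            pending = list(program[current]) + pending
    return "done"


ticks = []
abort = {"signal": None}


def tick():
    """System goal that raises the abort signal on its third execution."""
    ticks.append(len(ticks) + 1)
    if len(ticks) == 3:
        abort["signal"] = "abort"


program = {"loop": ["tick", "loop"]}      # a goal that would never terminate
outcome = reduce(program, "loop", abort, {"tick": tick})
```

Without the abort check the loop goal would be reduced forever; with it, the computation stops at the first reduction step after the signal appears.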

While  the  power  obtained  by  simple  interpreters  such  as  the  one  in  figure  16  is  fascinating, 
they  perform  very  poorly  in  terms  of  efficiency.  A  meta-interpreted  CP  goal  may  experience 
a  ten-  or  even  twenty-fold  slowdown.  A  remedy  to  this  situation  is  provided  by  a  transforma- 
tion technique,  called  partial  evaluation,  whose  theoretical  grounds  can  be  traced  back  to  the 
parameter  theorem  of  recursive  functions.  In  the  logic  programming  context,  partial  evalua- 
tion may  be  employed  to  remove  unnecessary  layers  of  interpretation.  Takeuchi  and  Furukawa 
[TF86a]  investigated  the  application  of  partial  evaluation  in  the  case  of  Prolog.  They  presented 
a  partial  evaluator  which,  when  given  an  interpreter  and  a  set  of  clauses,  produces  a  program 
which  behaves  identically  to  the  clauses  run  under  the  interpreter.  Hence,  the  resulting  pro- 
gram has  all  the  added  functions  but  avoids  the  interpretation  overhead.  Partial  evaluation 
proceeds  by  traversing  the  computation  tree  for  the  given  goal,  propagating  statically  known 
variable  values  and  even  substituting  calls  by  their  corresponding  bodies,  whenever  this  is 
possible.  Semantic  equivalence  between  the  original  and  the  partially  evaluated  goal  can  be 
proven  in  a  theoretical  sense. 

Partial evaluation techniques can also be applied to CP, although there is as yet no formal
proof  of  equivalence.  Figure  17  shows  a  sample  CP  program  for  the  reverse  relation  and 
its  corresponding  partially  evaluated  version  with  respect  to  the  abortable  meta-interpreter  of 
figure  16  [SS86].  Notice  that  the  transformations  between  the  two  program  versions  of  figure  17 


Plain version:

reverse([X|Xs],Ys) :-
    reverse(Xs?,Zs),
    append(Zs?,[X?],Ys).
reverse([],[]).

append([X|Xs],Ys,[X|Zs]) :-
    append(Xs?,Ys?,Zs).
append([],Ys,Ys).

Abortable version:

reverse([X|Xs],Ys,Abort) :- variable(Abort?) |
    reverse(Xs?,Zs,Abort?),
    append(Zs?,[X?],Ys,Abort?).
reverse([],[],Abort) :- variable(Abort?) | true.
reverse(Xs,Ys,abort).

append([X|Xs],Ys,[X|Zs],Abort) :- variable(Abort?) |
    append(Xs?,Ys?,Zs,Abort?).
append([],Ys,Ys,Abort) :- variable(Abort?) | true.
append(Xs,Ys,Zs,abort).

Figure  17:  Plain  and  abortable  version  of  the  reverse  relation  in  Concurrent  Prolog. 

mirror  the  respective  transformations  between  the  plain  and  the  abortable  meta-interpreter 
of  figures  15  and  16. 


5.3     Problems  with  Concurrent  Prolog 

The  unification  algorithm  of  CP,  as  specified  in  [Sha83],  turned  out  to  present  major  draw- 
backs. After  a  series  of  discussions  and  research  on  the  subject,  results  of  which  are  reported 
in  [Ued86a]  and  [Sar86],  the  following  appeared  to  be  some  of  the  problems  associated  with 
its definition:

• Unification was order-dependent; depending on the order under consideration, unification
could succeed, suspend or even cause deadlock. In addition, a single unification was
able to "feed itself" by simultaneously writing on a write-enable variable and reading
from its corresponding read-only variable [RS86].

• The specification of read-only variables in the head was left undefined; opinions on
whether to abolish them from the language were divided.

•  The  result  of  unifying  two  read-only  terms  was  left  unspecified. 

In  a  revised  definition  of  unification  [Sha86a]  prompted  by  the  above  criticisms,  the  order 
dependence  issue  was  resolved. 

The  definition  of  the  commitment  operation  in  CP  requires  a  "multiple  environment" 
mechanism.  Each  guard  evaluation  maintains  a  local  binding  environment  which  contains 
a  copy  of  the  variables  in  the  call  in  addition  to  the  local  variables  in  the  guard.  In  this 
way,  unification  and  guard  evaluation  of  all  candidate  clauses  can  proceed  in  parallel.  Call 
variables  are  not  copied  back  to  the  global  environment  until  after  a  clause  commits.  In  such  a 
case,  the  bindings  in  the  local  and  global  environment  are  unified.  The  original  CP  document 
[Sha83]  does  not  account  for  the  case  of  failure  in  performing  this  unification.  Ueda  [Ued86a] 
and  Saraswat  [Sar86]  meticulously  pursued  all  the  alternatives,  analyzed  their  behavior  and 
illustrated  them  by  means  of  examples.  The  consensus,  though,  appeared  to  be  that  no 
alternative  was  both  reasonable  and  consistent  with  the  rest  of  the  language.  Additional 
problems were associated with the way instantiations to incremental variables were propagated
from the global to the local environment as well.

On top of these semantic pitfalls of CP, disappointment also arose with respect to
efficient implementation. As Levy reports [Lev86], even after a number of suitable extensions and
enhancements to the language were made and a number of clever techniques were suggested,
no implementation managed to achieve the hoped-for efficiency. In the three years
following CP's proposal, several restricted variants of the language were suggested, attacking
various problem areas of CP. Some attempted to define new annotations, others replaced
the read-only annotation with a dual write-enable one, and others focused their attention on
assigning more static semantics to the language. A brief account of some of these proposals
can be found in [Lev86]. Only Flat Concurrent Prolog (FCP) [Sha86a], an interesting variation
of CP, will be mentioned here, since it is the founding member of a certain category of
parallel logic languages, as we shall see later.

5.4      Flat  Concurrent  Prolog 

FCP  derives  directly  from  CP  by  restricting  the  guards  to  contain  only  calls  to  a  predetermined 
set  of  simple  deterministic  predicates  (e.g.  arithmetic  comparison  predicates).  As  a  result, 
FCP  dispenses  with  tree-structured  local  environments  and  the  language  can  be  efficiently 
implemented. However, as Ueda points out [Ued86a], it does not eliminate the need for
multiple environments, nor does it resolve any of the issues above. FCP has been claimed
to lose only little expressiveness compared with the full power of CP [Sha86a]. This claim
was justified by the successful writing of some nontrivial programming projects, such as parts
of an operating system, a bootstrapping compiler and a programming environment, entirely
in FCP. Another direction taken to this end was to attempt a source-to-source
transformation from CP to FCP. The attempt proved feasible only for a more static,
non-flat  variant  of  CP  (called  Safe  Concurrent  Prolog)  and  the  transformation  algorithm  is 
illustrated  in  [CS86]  by  means  of  examples.  The  approach  followed  was  to  transform  a  number 
of  OR-parallel  candidate  clauses  with  non-flat  guards  into  a  single  clause  which  calls  all  the 
guards  in  an  AND-parallel  conjunction  in  its  body.  Whenever  a  guard  of  a  clause  succeeds, 
the  corresponding  body  is  selected  to  execute.  The  resulting  program  can  be  written  in  a  flat 
language.  Due  to  the  nature  of  the  transformation,  this  technique  is  also  known  as  compilation 
of  OR-parallelism  into  AND-parallelism. 


6      Parlog 

The second attempt to loosen the tight constraints of RL was Parlog9 [CG83], developed by the
authors of RL. The motivation behind this new proposal was that passing uninstantiated
variables in messages was recognized as an expressive language feature10, elegantly adopted by
CP and not difficult to implement on a tightly coupled architecture. Parlog83 maintained the
mode declarations philosophy but allowed input arguments to be weak, i.e. potentially instantiated
by more than one call, giving ground for two-way communication over an incremental
term, also known as back communication. As in RL, relations could be accompanied by more
than one mode declaration, thus allowing multi-use programs. Which mode was in effect
each time could be determined at compile-time by means of standard static rules which could,
however, be overridden by explicit "^" or "?" annotations. Figure 18 illustrates how an input

relation variablemonitor(?).
variablemonitor(requests) :- variable(requests,_).

relation variable(?,?).

variable([read(x^)|requests],x) :- variable(requests,x).
variable([write(x)|requests],_) :- variable(requests,x).
variable([],_)

Figure  18:  Multiply  assignable  variable  in  Parlog83 

mode can be overridden to obtain partially instantiated messages by implementing a multiply
assignable  variable  as  a  monitor.  By  direct  extension  of  this  method  one  can  readily  program 
multiply  assignable  arrays,  records  and  so  on. 
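
The monitor of figure 18 can be paraphrased in a conventional language as a loop over the request stream that carries the current value as its state. In the Python sketch below, Out is a hypothetical stand-in for the output-annotated variable carried by a read message, and None plays the role of the anonymous initial value; neither name comes from Parlog83.

```python
class Out:
    """The slot a read message carries so that the monitor can reply."""
    def __init__(self):
        self.value = None


def variable_monitor(requests, initial=None):
    """Serve read and write messages in order; return the final value."""
    state = initial
    for op, arg in requests:
        if op == "read":
            arg.value = state      # back communication: fill the caller's slot
        elif op == "write":
            state = arg            # the recursive call continues with the new value
        else:
            raise ValueError("unknown message: " + op)
    return state


r1, r2 = Out(), Out()
final = variable_monitor([("write", 1), ("read", r1), ("write", 2), ("read", r2)])
```

Each read observes the most recent write that precedes it in the stream, which is what makes the monitor behave like an assignable variable.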

The language was divided into two subsets: the and-relation subset and the or-relation
subset. A conjunction of and-relation calls can be evaluated in parallel in exactly the same
manner  as  in  CP,  having  shared  variables  act  as  communication  channels.  As  with  the  previous 
committed-choice  guarded  languages,  at  most  one  solution  is  produced  for  each  call.    This 


9This  first  version  of  the  language  has  been  subject  to  substantial  change;   we  shall  refer  to  it  here  as 
Parlog83. 


10this became particularly clear when the writing of a bootstrapping compiler for RL was attempted.
sacrifice of completeness over Prolog can be very awkward when applications requiring more
than one solution, possibly all, are dealt with. Or-relations are provided for exactly such types
of applications. The evaluation of a conjunction of or-relation calls proceeds sequentially
from left to right but explores all alternative computational paths in parallel. Or-relation
definitions  are  mode-free  and  their  clauses  may  contain  no  guards.  Or-relation  calls  may 
only  appear  inside  a  set  constructor,  which  provides  the  interface  between  the  two  language 
subsets11.  The  set  constructor  itself  can  be  viewed  as  an  and-relation  i.e.  values  in  the  set  can 
be  concurrently  consumed  by  another  AND-parallel  call  as  soon  as  they  are  generated. 

For  example,  the  set  constructor  collecting  all  solutions  to  the  8-queens  problem  takes 
the  form 

relation queens(^)

queens(x) :- x = { y : perm([1,2,3,4,5,6,7,8],y) & safe(y) }

where  perm  is  an  or-relation  and  safe  is  an  and-relation.  The  call  to  safe  inside  the  set  con- 
structor can  be  viewed  as  a  "filter"  to  the  exhaustive  perm  relation.  Although  the  behavior  of 
the  above  call  appears  to  be  identical  to  the  IC-Prolog  coroutining  solution  outlined  previously, 
there  does  exist  a  difference.  Due  to  the  AND-sequentiality  of  or-relation  calls,  a  solution  to  y 
should  be  completely  constructed  by  perm  before  a  safe  process  can  be  spawned.  Notice  that 
the  IC-Prolog  solution  allowed  for  such  stream  parallelism  even  when  an  exhaustive  search 
was  performed. 
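
The generate-and-filter reading of the set constructor can be reproduced with a generator expression. The Python sketch below is an analogue, not Parlog83: the standard library function itertools.permutations plays the role of perm, the function safe is written from the usual definition of the problem (no two queens on a common diagonal), and evaluation is sequential rather than OR-parallel.

```python
from itertools import permutations


def safe(y):
    """True if no two queens share a diagonal; y[i] is the row of column i."""
    return all(abs(y[i] - y[j]) != j - i
               for i in range(len(y))
               for j in range(i + 1, len(y)))


def queens(n=8):
    """Lazily yield every permutation that passes the safe filter."""
    return (y for y in permutations(range(1, n + 1)) if safe(y))


solutions = list(queens())
```

As in the Parlog83 formulation, each permutation is built completely before safe examines it, so unsafe prefixes are not pruned early. A consumer can nevertheless start using the first solutions before the search is exhausted, much as the set constructor allows.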

A  major  difference  of  the  language  with  CP  lies  in  the  operational  semantics  of  com- 
mitment. More  specifically,  Parlog83  adopts  the  concept  of  suspension.  If  unification  of  the 
call  with  the  head  of  a  clause  tries  to  bind  a  variable  of  an  input  argument  of  the  call  to  a 
nonvariable  term,  it  is  suspended.  Moreover,  the  evaluation  of  the  guards  proceeds  in  parallel 
with  unification  so  that  potential  guard  or  unification  failure  is  reported  as  early  as  possible. 
Alternative clauses thus fall into one of the following three categories [CG83]:

•  Candidate  clauses,  whose  unification  and  guard  evaluation  were  both  successful. 

•  Non-candidate  clauses  whose  unification  or  guard  evaluation  (or  both)  failed. 

•  Input  suspended  clauses  whose  unification  or  guard  evaluation  suspended,  awaiting  fur- 
ther instantiation  of  an  input  variable  from  the  caller;  in  addition,  neither  unification 
nor  guard  evaluation  failed. 
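
The three categories can be illustrated on the list partitioning relation. The Python sketch below is a simplification under stated assumptions: UNBOUND is a hypothetical marker for an input the caller has not yet instantiated, the clause names "le", "gt" and "empty" are invented labels for the three partition clauses, and lists are either wholly unbound or wholly known. The rule in reduce_call (commit to a candidate, otherwise suspend if some clause is suspended, otherwise fail) reflects the usual committed-choice reading and is not spelled out in the text above.

```python
UNBOUND = object()   # marks an input argument that is not yet instantiated


def classify(clause, x, items):
    """Classify one partition clause for the call partition(x, items, _, _)."""
    if items is UNBOUND:
        return "suspended"                 # head unification needs the list
    if clause == "empty":
        return "candidate" if items == [] else "non-candidate"
    if items == []:
        return "non-candidate"             # a head of the form [l|ls] cannot match []
    first = items[0]
    if x is UNBOUND or first is UNBOUND:
        return "suspended"                 # the guard needs both values
    if clause == "le":
        return "candidate" if first <= x else "non-candidate"
    return "candidate" if first > x else "non-candidate"


def reduce_call(x, items):
    """Pick a clause to commit to, or report suspension or failure."""
    states = {c: classify(c, x, items) for c in ("le", "gt", "empty")}
    for clause, state in states.items():
        if state == "candidate":
            return clause
    return "suspend" if "suspended" in states.values() else "fail"


committed = reduce_call(5, [3, 9])        # first element 3 =< 5
waiting = reduce_call(UNBOUND, [3, 9])    # pivot not yet known
```

Once the caller instantiates the missing input, the suspended clauses can be classified again and the call can commit.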

Consider the Parlog clauses for partitioning a list in figure 19 and compare them with the
corresponding CP solution in figure 10.

Parlog83  was  also  influenced  by  the  functional  programming  methodology.  And-relations 
that  exhibit  a  functional  behavior,  i.e.  one  argument  is  uniquely  determined  by  the  values 
of  the  others,  can  be  defined  by  means  of  conditional  equations.  Conditional  equations  in 
Parlog83  assume  the  same  behavior  as  in  the  parallel  language  model  described  in  [HHT82], 
where  such  notation  was  originally  introduced.  They  allow  relations  to  be  written  in  a  more 
readable  functional  notation  when  nondeterminism  is  not  required.  Their  mode  of  evaluation 
can be eager or lazy. Eager evaluation allows nested function calls to proceed as an AND-parallel
conjunction of relation calls. As an example of a program written using conditional
equations, the π calculation example is shown in figure 20. An approximation of the value of


11 Or-relation calls can, in fact, appear outside a set constructor as another way of breaching the mode
constraints as illustrated in [CG83]. However, only one solution will be returned from such a call.

mode partition(?, ?, ^, ^).

partition(x,[l|ls],[l|xs],ys) :- l =< x | partition(x,ls,xs,ys).
partition(x,[l|ls],xs,[l|ys]) :- l > x | partition(x,ls,xs,ys).
partition(x,[],[],[])

?- partition(x,l,l1,l2)


Figure  19:  List  partitioning  in  Parlog83 


function pi(?)

pi(n) = sigma(terms(integers(1,n)))/n

function integers(?, ?)

integers(i,n) = [i|integers(i+1,n)]    if i =< n |.
integers(i,n) = []    if i > n |

function terms(?, ?)

terms([i|is],n) = [4/(1+t*t)|terms(is,n)]    if t is (i-0.5)/n.
terms([],n) = []

function sigma(?)

sigma([t1,t2|ts]) = sigma([t1+t2|add(ts)]).
sigma([t]) = t.
sigma([]) = 0

function add(?)

add([t1,t2|ts]) = [t1+t2|add(ts)].
add([t]) = [t].
add([]) = []


Figure 20: Calculation of π using conditional equations in Parlog83.
π can be obtained by the sum

$$\frac{1}{n} \sum_{i=1}^{n} f\left(\frac{i - 0.5}{n}\right), \quad \text{where} \quad f(x) = \frac{4}{1 + x^2},$$

which is itself an approximation of the definite integral $\int_0^1 f(x)\,dx$ using the rectangle rule. In
figure 20, function integers generates a list of integers from 1 to n, terms generates a list of
terms $f\left(\frac{i - 0.5}{n}\right)$ and sigma sums them up using a balanced addition scheme by means of the
pairwise add function.
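For readers who want to check the arithmetic, the conditional equations of figure 20 can be transcribed into sequential Python roughly as follows. The function names mirror the figure; the `error` helper is an addition for illustration, and the eager AND-parallel evaluation of nested calls is not modeled.

```python
import math

def integers(i, n):
    # List of integers from i to n inclusive.
    return list(range(i, n + 1))

def terms(ints, n):
    # One term f((i - 0.5)/n) per integer, with f(x) = 4 / (1 + x*x).
    return [4 / (1 + t * t) for t in ((i - 0.5) / n for i in ints)]

def add(ts):
    # Pairwise addition; a leftover single element is passed through unchanged.
    return [sum(ts[k:k + 2]) for k in range(0, len(ts), 2)]

def sigma(ts):
    # Balanced summation: each round roughly halves the length of the list.
    if not ts:
        return 0
    while len(ts) > 1:
        ts = [ts[0] + ts[1]] + add(ts[2:])
    return ts[0]

def pi(n):
    return sigma(terms(integers(1, n), n)) / n

def error(n):
    # Illustrative helper: distance of the approximation from math.pi.
    return abs(pi(n) - math.pi)
```

The balanced scheme matters in the parallel setting because the additions within one round are independent of each other; in this sequential sketch it only changes the order of the floating-point additions.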

Another  feature  of  Parlog83  not  proposed  by  CP  was  the  sequential  conjunctive  operator 
"&".  Various  combinations  of  sequential  and  parallel  evaluation  can  provide  a  variety  of 
interesting  behaviors  for  the  same  set  of  clauses.  Such  an  operator  was  intentionally  left  out  in 
CP  as  its  effect  could  be  simulated  by  means  of  a  control  variable  and  the  read-only  annotation. 
The  goal  to  be  executed  first  owns  the  write-enable  version  of  the  control  variable,  whereas 
the goal to be executed next is suspended on its read-only counterpart. Upon successful
termination of the first goal, the control variable is instantiated, activating this way the second
goal.  The  same  behavior  can  also  be  derived  by  using  the  mode  constraints  of  Parlog83.  The 
control  variable  approach  is  equivalent  to  a  sequential  operator,  but  its  correct  placement  and 
instantiation  requires  careful  consideration  by  the  programmer.  On  the  other  hand,  as  Kusalik 
notes  [Kus84],  the  absence  of  a  sequential  operator  in  CP  has  often  prompted  incorrect  use  of 
the  commit  operator  to  achieve  sequentiality.  The  delay  in  propagating  variable  bindings  to 
the  rest  of  the  clause  makes  the  commit  operator  a  poor  choice  for  this  purpose.  The  execution 
of  such  programs  often  results  in  hard  to  detect  deadlocks.  Gregory  argues  that  the  availability 
of a sequential operator gives the programmer tighter control over the granularity12 of the
program [Gre87]. This may be of significant importance when the program is to be run on
architectures of largely varying grain of parallelism. As evidence, Parlog's sequential operator
was  found  more  convenient  than  FCP's  more  elaborate  control  variable  mechanism,  when 
an  implementation  of  a  parallel  numerical  algorithm  was  attempted  in  both  languages,  as 
reported  in  [BLM086].  The  incorporation  of  a  sequential  operator  in  parallel  logic  languages, 
however,  is  still  debatable. 
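The control variable technique described above has a close analogue in conventional thread programming. The following Python sketch is an illustration only: `threading.Event` plays the role of the control variable, `set` corresponds to its instantiation by the first goal, and `wait` corresponds to suspension on the read-only occurrence. The function name `run_in_sequence` is hypothetical.

```python
import threading

def run_in_sequence(first, second):
    """Run two 'goals' as concurrent threads, ordered only by a control variable."""
    log = []
    done = threading.Event()      # the control variable, initially "unbound"

    def goal1():
        log.append(first())
        done.set()                # instantiated upon successful termination

    def goal2():
        done.wait()               # suspended until the control variable is bound
        log.append(second())

    # The second goal is deliberately started first; the ordering of the
    # results comes from the control variable, not from the start order.
    threads = [threading.Thread(target=goal2), threading.Thread(target=goal1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

If `first` raises an exception the event is never set and the second thread waits forever, which mirrors the observation that misplaced control variables lead to deadlocks that are hard to detect.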

Finally,  Parlog83  inherited  RL's  bounded  buffer  notation  —  although  the  zero  buffer 
case  was  disallowed  for  implementation  convenience.  The  language  also  included  a  function 
application  facility  similar  to  Lisp's  mapcar  primitive  and  program  manipulation  predicates 
equivalent  to  Prolog's  assert  and  retract. 

6.1      Transition  to  Parlog84 

Over  the  following  year,  Parlog  underwent  a  series  of  modifications  focusing  on  redundant 
feature  elimination  and  on  a  better  understanding  of  program  behavior.  The  result  was  a 
language  with  no  functional  subset,  no  annotations  and  an  entirely  different  way  of  viewing 
communication  constraints,  hereinafter  named  Parlog84  [CG84b].  Parlog84  kept  the  mode 
declarations  — although  only  one  mode  per  relation  definition  is  allowed —  and  renamed  and- 
and  or-relations  to  single  and  multiple  solution  predicates  respectively.  Moreover,  a  new 
sublanguage,  called  Kernel  Parlog,  was  introduced  in  which  all  source  Parlog  programs  are 
first  translated.  The  Kernel  Parlog  form  of  a  program  is  called  its  standard  form  and  can  be 
considered  as  a  version  of  the  program  in  which  mode  declarations  have  been  compiled  away. 
As  a  matter  of  fact,  Kernel  Parlog  is  modeless  and  all  arguments  in  the  head  are  distinct 
variables.  In  addition,  it  supports  the  following  restricted  unification  primitives: 

•  one-way  unification13  "<=":  can  only  instantiate  variables  in  its  left  argument;  if  it 
can  only  proceed  by  binding  variables  in  its  right  argument,  it  suspends. 

•  test unification "==": attempts to perform unification of its arguments without
instantiating them; if it can only proceed by instantiating one of its arguments, it suspends.

•  assignment  unification14  ":=":  expects  an  uninstantiated  variable  term  as  its  left 
argument  which  it  instantiates  to  its  right  argument;  an  error  occurs  if  its  left  argument 
is  not  a  variable. 

•  data:  suspends  until  its  argument  becomes  instantiated. 


12 degree of parallelism.

13 also referred to as input matching.

14 also referred to as output matching.

•  var: succeeds if at the time of call its argument is an uninstantiated variable; fails
otherwise.
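The behavior of one-way unification can be sketched in Python as a three-valued matcher. Everything about the representation is an assumption made for illustration: variables are `Var` objects, compound terms are tuples, bindings live in a dictionary, and the results are the strings "ok", "suspend" and "fail". It is not how a Parlog implementation represents terms.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

def walk(term, env):
    # Follow bindings until reaching a non-variable or an unbound variable.
    while isinstance(term, Var) and term in env:
        term = env[term]
    return term

def one_way(left, right, env):
    """Sketch of left <= right: may bind variables of left only."""
    left, right = walk(left, env), walk(right, env)
    if isinstance(left, Var):
        if left != right:
            env[left] = right
        return "ok"
    if isinstance(right, Var):
        # Progress would require binding a variable on the right: suspend.
        return "suspend"
    if isinstance(left, tuple) and isinstance(right, tuple):
        if len(left) != len(right):
            return "fail"
        result = "ok"
        for l, r in zip(left, right):
            status = one_way(l, r, env)
            if status == "fail":
                return "fail"
            if status == "suspend":
                result = "suspend"
        return result
    return "ok" if left == right else "fail"
```

For instance, matching the pattern `("cons", Var("y"), Var("ys"))` against an unbound `Var("b")` yields "suspend", which corresponds to a clause waiting for its caller to supply the list.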

The unification primitives above express the operational behavior of mode declarations
in an explicit manner, by rewriting unification as calls in the guard. These primitives
can also be used directly by Parlog programs, sacrificing declarativeness for the sake of a
clearer understanding of process behavior. As an example of Kernel Parlog, the original and the
standard form of a predicate performing a parallel list search for an element satisfying a certain
property are shown in figure 21.

Parlog:

mode try(?,?,^).

try(x,[],NONE).
try(x,[y|ys],y) <- satisfy(y,x) :.
try(x,[y|ys],z) <- not(satisfy(y,x)), try(x,ys,z) :.

Kernel Parlog (standard form):

try(a,b,c) <- [] <= b : c := NONE.
try(a,b,c) <- [y|ys] <= b, satisfy(y,a) : c := y.
try(a,b,c) <- [y|ys] <= b, not(satisfy(y,a)), try(a,ys,z) : c := z.

Figure  21:  Parlog  and  Kernel  Parlog  version  of  a  parallel  list  search  predicate. 

The concept of a standard form was also suggested by Ueda as a solution to a particular
problem of CP [Ued86a]. Ueda noticed that some CP clauses, although
"logically" equivalent, exhibited different behavior. He further argued that translating such
clauses  into  a  common  standard  form  and  allowing  unification  and  guard  evaluation  to  proceed 
in  parallel  would  considerably  simplify  program  understanding.  As  an  additional  advantage 
concerning  programming  style,  the  logical  flow  of  a  Kernel  Parlog  program  coincides  with  its 
textual  order:  first  come  input  matching  calls,  then  evaluation  of  guards,  then  output  matching 
calls  and  finally  execution  of  the  body.  CP  (and  hence  Parlog)  programs  have  been  criticized 
for  not  conforming  to  the  above  discipline  typical  of  imperative  style  parallel  programs  [Gel84]. 

From  the  implementation  point  of  view,  Parlog84  presented  a  major  breakthrough  over 
CP.  In  particular,  it  did  away  with  the  elaborate  multiple  environment  mechanism,  by  imposing 
a  simple  compile-time  checkable  constraint:  guard  safety.  A  guard  is  safe  if  it  does  not  bind 
variables  in  an  input  argument  in  the  head  of  the  clause  [CG85].  Guards  may  only  bind 
variables  local  in  the  guards  or  body,  or  variables  in  the  output  arguments  of  the  head. 
Intuitively,  a  guard  is  unsafe  whenever  it  tries  to  communicate  bindings  to  the  call  before 
commitment to that clause. An example of an unsafe guard is the call "y is x+1" in the guard
of the clause

mode  increment(?). 

increment([increment(x,y)|rest]) <- y is x+1 : increment(rest).

since it binds variable y appearing in a weak input argument of the increment clause. Although
a conservative data-flow algorithm for guard safety is employed for Parlog84, rejected
programs will in most cases be unsafe. All accepted programs, however, are guaranteed to be
safe.

Along with the conjunctive sequential operator inherited from its immediate ancestor,
Parlog84 introduced a disjunctive sequential operator ";" as well. The use of such
an operator allows certain clauses to be considered only if a group of other alternative clauses
proves non-candidate. The original description of CP did not provide for such an operator,
although a restricted version of it was introduced some time later, when it appeared useful
in  the  context  of  a  particular  CP  programming  paradigm  [ST83].  A  built-in  predicate,  called 
otherwise,  could  appear  in  the  guard  of  the  last  of  a  group  of  alternative  clauses,  delaying 
consideration  of  that  clause  until  after  all  other  clauses  failed  to  commit.  Subsequently,  Ueda 
suggested  a  generalization  of  this  scheme  [Ued86a],  allowing  the  predicate  to  occur  in  any 
guard,  which  achieves  the  same  effect  as  Parlog's  ";"  operator. 

6.2      Metaprogramming  in  Parlog84 

All the discussion about meta-interpreters for CP applies equally well to Parlog. Thus, a
Parlog programmer who needs some added functionality can write an enhanced
Parlog meta-interpreter and rely on the partial evaluation capabilities of the underlying system
for  an  efficient  execution.  Experiments  have  shown,  however,  that  even  partially  evaluated 
programs  may  experience  an  undesirable  slowdown  compared  to  the  programs  without  the 
added  functionality.  What  seems  to  be  worse  is  that  depending  on  the  type  of  functionality 
required,  it  may  be  hard  or  even  impossible  to  express  it  in  terms  of  a  meta-interpreter.  For 
these  reasons,  Parlog84  took  a  different  approach  to  metaprogramming.  The  language  defined 
two  forms  of  metaprogramming  predicates:  the  simple  and  the  three-argument  metacall.  The 
simple  metacall  takes  the  form 

call(conjunction?)

It suspends until conjunction is instantiated to a conjunction of Parlog goals and then evaluates
it. The behavior of call is identical to that of conjunction. Figure 22 shows how the metacall

mode shell(?).

shell([fg(cmd)|cmds]) <- call(cmd) & shell(cmds).

shell([bg(cmd)|cmds]) <- call(cmd), shell(cmds).

shell([]).

Figure  22:  A  Parlog  shell  with  foreground  and  background  commands. 

predicate  can  be  useful  in  building  a  simple  operating  system  shell  that  executes  foreground 
and  background  Parlog  commands.  This  example,  appearing  in  [CG84a],  is  the  translation 
of the corresponding CP program given by Shapiro in [Sha86b]. The CP version, however,
places the metacall predicate in the guard to achieve the sequentiality required by foreground
commands. This has the undesirable effect of delaying any bindings made by a
foreground command until after its successful completion. Shapiro discusses a remedy to this
problem, by introducing the concept of write-early variables. As their name suggests, these
variables override the multiple environment mechanism of CP, by reporting their bindings to
the global environment as soon as they are generated. Although write-early variables allow
guards to achieve real sequential behavior, they are entirely inconsistent with the language
philosophy and can be a source of accidental errors or malicious abuse.
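The difference between the two shell clauses, sequential conjunction for foreground commands and parallel conjunction for background ones, can be sketched with Python threads. This is a loose analogy rather than a translation: commands are assumed to be plain callables, and the `shell` function and its return value are hypothetical.

```python
import threading

def shell(cmds):
    """cmds is a list of ('fg', fn) or ('bg', fn) pairs.

    Returns the background threads that were started, so the caller can join them.
    """
    background = []
    for kind, cmd in cmds:
        if kind == "fg":
            # call(cmd) & shell(cmds): finish the command before reading on.
            cmd()
        else:
            # call(cmd), shell(cmds): run the command alongside the rest.
            t = threading.Thread(target=cmd)
            t.start()
            background.append(t)
    return background
```

A foreground command therefore always completes before any later command starts, while a background command may still be running when the shell moves on.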

As another interesting application of the simple metacall, consider the clauses in figure 23
implementing the negation as failure rule in Parlog [Gre87]. For the negation rule to be
correctly expressed, no variables should be bound during the evaluation of the negated call.

mode not(?).

not(x) <- call(x) : fail;

not(x).

Figure  23:  Negation  as  failure  in  Parlog 

As  Gregory  notes,  what  is  interesting  about  this  program  is  that  guard  safety  implies  exactly 
the  correctness  of  the  negation  as  failure  rule. 

The  simple  call  is  by  itself  powerful;  yet,  it  appears  to  be  of  little  help  in  cases  where 
the  conjunction  of  goals  it  executes  fails.  We  would  rather  have  call  succeed  and  report 
such  a  failure.  In  addition,  in  an  operating  system  framework,  the  supervisor  may  want  to 
temporarily  suspend  the  execution  of  a  process  or  even  terminate  it.  The  needs  of  such  systems 
forced  Parlog  to  employ  an  even  more  powerful  metacall  primitive:  the  three-argument  call 
[CG84b].  It  takes  the  form 

call(conjunction?, status^, control?)

where the last two arguments have the following meaning: if conjunction succeeds or fails,
status is bound to SUCCEEDED or FAILED respectively. control, which is initially an uninstantiated
variable, can be bound by another process either to STOP or to an incremental stream of
messages [SUSPEND,CONTINUE,SUSPEND|xs] which may end with a STOP; conjunction
obeys the control commands and status is bound either to STOPPED or to the incremental
stream [SUSPEND,CONTINUE,SUSPEND|ys]. The call predicate itself always succeeds. As
a result, even if conjunction fails, variable bindings generated up until failure are preserved. As
an  example  of  the  versatility  of  this  form  of  the  call,  figure  24  demonstrates  the  flat  equivalent 
version  of  the  Kernel  Parlog  try  clauses  of  figure  21.    The  transformation  algorithm  used 

try(a,b,c) <-
    call(simple-try(s,a,b),s1,c1),
    call(([y|ys] <= b, satisfy(y,a)),s2,c2),
    call(([y|ys] <= b, not(satisfy(y,a)), try(a,ys,z)),s3,c3),
    or(s1,s2,s3,c1,c2,c3),
    body-try(s,s2,s3,c,y,z).

simple-try(s,a,b) <- [] <= b : s := SUCCEEDED.

or(s1,s2,s3,c1,c2,c3) <- SUCCEEDED <= s1 : c2 := STOP, c3 := STOP.
or(s1,s2,s3,c1,c2,c3) <- SUCCEEDED <= s2 : c1 := STOP, c3 := STOP.
or(s1,s2,s3,c1,c2,c3) <- SUCCEEDED <= s3 : c1 := STOP, c2 := STOP.

body-try(s,s2,s3,c,y,z) <- SUCCEEDED <= s : c := NONE.
body-try(s,s2,s3,c,y,z) <- SUCCEEDED <= s2 : c := y.
body-try(s,s2,s3,c,y,z) <- SUCCEEDED <= s3 : c := z.

Figure 24: Flat Kernel Parlog version of try predicate.
was suggested by Gregory [Gre87], for cases where the underlying architecture would perform
better in the absence of tree-structured guard computations.

Recent  efforts  at  Imperial  College  to  build  a  versatile  Parlog  Programming  System  have 
suggested  the  adoption  of  an  even  more  powerful  metacall  primitive.  More  specifically,  Foster 
[Fos87]  has  proposed  a  five-argument  metacall  of  the  form 

call(module?, resources?, goal?, status^, control?)

where  module  specifies  the  code  fragment  in  which  goal  will  be  evaluated  and  resources  the 
resources  to  be  allocated.  Furthermore,  exceptions  are  introduced  into  the  language  in  the 
framework  of  the  metacall  predicate.  When  deadlock,  overflow  or  excessive  resource  allocation 
is  detected,  the  status  argument  is  bound  to  a  term  of  the  form 

exception(DEADLOCK)

If, on the other hand, a relation is found undefined in module, a term of the form

exception(UNDEFINED, goal, newgoal)

is  returned,  where  goal  is  the  goal  that  caused  the  exception.  An  exception  handler  can  now 
provide  a  newgoal  or  even  a  new  module  to  be  tried  alternatively. 
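The status and control arguments of the three-argument metacall described earlier can be approximated by a small Python driver. The sketch rests on several assumptions that are not part of Parlog: a goal is a generator function whose every `yield` marks one reduction step, failure is signalled by a hypothetical `GoalFailed` exception, and only the STOP message is modeled (SUSPEND and CONTINUE are accepted but ignored).

```python
class GoalFailed(Exception):
    """Raised by a goal to signal logical failure (illustrative convention)."""

def metacall(goal, control=()):
    """Run goal step by step and report its fate as a status string.

    control is an iterable of messages, consulted once before each step.
    Like the metacall, this function itself never fails.
    """
    messages = iter(control)
    steps = goal()
    while True:
        if next(messages, None) == "STOP":
            steps.close()          # abandon the remaining computation
            return "STOPPED"
        try:
            next(steps)            # perform one reduction step
        except StopIteration:
            return "SUCCEEDED"
        except GoalFailed:
            return "FAILED"
```

Because failure is turned into a returned status, a supervisor written this way can keep running after one of its supervised goals fails, which is the property the operating system setting requires.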

6.3     Lazy  Evaluation  and  Bounded  Buffers 

Whenever  a  producer  and  a  consumer  are  set  up  in  Parlog,  there  is  no  limit  to  the  number  of 
values  that  can  be  deposited  by  the  producer  before  a  value  is  received  by  the  consumer.  In 
other  words,  an  unbounded  buffering  scheme,  where  the  producer  can  run  arbitrarily  ahead  of 
its  consumer  is  the  default  handling.  However,  there  are  cases  where  the  producer  requires  a 
tighter  control,  if  resources  are  not  to  be  wasted.  We  shall  see  that  bounded  buffering  and,  as  a 
special  case  lazy  evaluation,  can  be  achieved  in  Parlog84  by  minor  modifications  to  equivalent 
eager  programs. 

Consider as an example the eager and the corresponding lazy (demand-driven) version of
a Parlog program that produces an infinite stream of squares of positive integers, appearing
in figure 25. Whereas a call to eager-enumerate produces an infinite list of squares, a call
to lazy-enumerate requires a list structure before it can begin execution. For each new
variable  placed  in  this  list  structure,  a  new  square  is  calculated  which  binds  the  variable.  Now 
if  the  list  is  incrementally  instantiated  by  another  process,  that  process  is  in  effect  controlling 
the  rate  of  squares  generation.  Notice  that  the  differences  between  the  eager  and  lazy  versions 
of  the  squares  program  are  minor.  The  lazy  output  stream  is  transformed  into  an  incrementally 
generated  input  stream  of  variables,  each  one  of  which  triggers  the  calculation  of  one  more 
stream  value.  A  number  of  other  details  that  need  to  be  accounted  for  in  the  conversion  are 
further  described  by  Clark  and  Gregory  in  [CG84b]. 
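Demand-driven production has a direct counterpart in Python generators, where a value is computed only when the consumer asks for it. The sketch below is an analogy under that assumption: asking for k values with `islice` plays the role of supplying a list of k fresh variables, and the function names are chosen to echo figure 25.

```python
from itertools import count, islice

def integers(start=1):
    # Infinite stream of integers; nothing is computed until it is consumed.
    return count(start)

def squares(ints):
    # Each square is produced only when the consumer requests the next item.
    for i in ints:
        yield i * i

def lazy_enumerate(k):
    """Demand exactly k squares, like supplying a list of k fresh variables."""
    return list(islice(squares(integers(1)), k))
```

An eager counterpart would have to materialize the whole stream, which is impossible here since the stream is infinite; the consumer's demand is what bounds the work.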

Although  lazy  evaluation  is  a  practical  technique  for  slow  consumers,  it  might  be  too 
constraining  if  the  consumer  is  sometimes  slow,  but  sometimes  very  fast.  The  situation  calls 
for  use  of  bounded  buffers,  where  the  producer  can  run  ahead  of  the  consumer  within  a  fixed 
limit.  Recall  that  IC-Prolog,  RL  and  Parlog83  included  bounded  buffering  in  the  language. 
The  main  reason  for  their  elimination  in  Parlog84  was  that  they  introduced  redundancy.  They 
can  in  fact  be  implemented  by  the  clauses  in  figure  26.  Using  the  lazy  versions  of  the  integers 
and squares relations from figure 25, we can obtain a fixed-size buffered communication by

Eager version:

mode integers(?,^).
integers(i,[i|is]) <- j is i+1, integers(j,is).
integers(i,[]).

mode squares(?,^).
squares([i|is],[j|js]) <- j is i*i, squares(is,js).
squares([],[]).

mode eager-enumerate(^).
eager-enumerate(js) <- integers(1,is), squares(is,js).

Lazy version:

mode integers(?,?).
integers(i,[i1|is]) <- i1 := i, j is i+1, integers(j,is).
integers(i,[]).

mode squares(^,?).
squares([i|is],[j|js]) <- j is i*i, squares(is,js).
squares([],[]).

mode lazy-enumerate(?).
lazy-enumerate(js) <- integers(1,is), squares(is,js).

Figure  25:  Eager  and  lazy  enumeration  of  squares  of  integers  in  Parlog84. 

mode create-buffer(?,^,^).

create-buffer(n,[_|vars],rest) <- n > 0 : n1 is n-1, create-buffer(n1,vars,rest).

create-buffer(0,rest,rest).

mode buffer(?,^,?).

buffer([prod|prods],var,[req|reqs]) <- req := prod, var := [_|vars], buffer(prods,vars,reqs).
buffer(_,var,[]) <- var := [].

Figure  26:  A  bounded  buffer  mechanism  for  Parlog84. 

the  following  enumeration  predicate  definition: 

mode buff-enumerate(?,?).
buff-enumerate(n,js) <- create-buffer(n,buf,tail),
    integers(1,buf), buffer(buf,tail,reqs), squares(reqs,js).

Bounded  buffers  for  parallel  logic  languages  have  been  studied  by  Clark  and  Gregory 
[CG84b],  who  suggested  the  above  technique  and,  from  a  slightly  different  viewpoint,  by 
Takeuchi  and  Furukawa  [TF85]  for  CP. 
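The effect of a bounded buffer, a producer allowed to run at most n items ahead of its consumer, can be sketched in Python with a generator that prefetches into a `deque`. This illustrates the behavior only and does not mirror the create-buffer and buffer clauses; the name `buffered` is hypothetical and n is assumed to be at least 1.

```python
from collections import deque

_DONE = object()  # sentinel marking an exhausted producer

def buffered(producer, n):
    """Yield the producer's items while keeping at most n items prefetched."""
    buf = deque()
    for _ in range(n):                 # initial slack: n empty slots to fill
        item = next(producer, _DONE)
        if item is _DONE:
            break
        buf.append(item)
    while buf:
        yield buf.popleft()            # one item consumed ...
        item = next(producer, _DONE)   # ... frees one slot for the producer
        if item is not _DONE:
            buf.append(item)
```

Each consumed item releases exactly one slot, much as each request in figure 26 extends the list of variables available to the producer by one.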

6.4     All  Solutions  Predicates 

The  multiple  solution  subset  of  Parlog84  corresponds  to  the  or-relation  subset  of  Parlog83. 
Although  the  form  of  the  set  constructor  changed  to 

set(list^, term?, conjunction?)

its  evaluation  rule  remained  the  same,  i.e.  it  potentially  exploits  OR-parallelism  and  restricted 
AND-parallelism15.  Moreover,  for  implementation  convenience,  Parlog  requires  that  all  global 
variables16 be completely instantiated before the call to set is started. Multiple solution
clauses are, as in Parlog83, modeless, which suggests that they can be used in any direction.
Extensive details on the behavior of set were deliberately left unspecified to provide flexibility
for the implementation. The "all permutations" clauses in figure 27 are an example of a
multiple solution relation.

15 parallelism among conjunctive goals that do not share uninstantiated variables, named this way by DeGroot.

16 variables shared between conjunction and the rest of the clause in which the set expression appears.

perm([x|xs],[y|ys]) <- delete([x|xs],y,z) & perm(z,ys).
perm([],[]).

delete([x|xs],x,xs).
delete([x|xs],y,[x|ys]) <- delete(xs,y,ys).

Figure 27: Deriving all permutations of a list using the multiple solution subset of Parlog84.

The call

?- set(list,perm,perm([1,2,3],perm))

extracts all solutions by binding list to

list = [[1,2,3],[1,3,2],[2,1,3],[2,3,1],[3,1,2],[3,2,1]]

in  some  arbitrary  order. 
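The perm and delete relations can be mimicked in Python with generators, where each `yield` corresponds to one alternative solution. This is a sequential sketch: it enumerates solutions in the order chronological backtracking would find them, whereas the set constructor is free to return them in any order.

```python
def delete(xs):
    """Yield (y, rest) for every way of removing one element y from xs."""
    for i, y in enumerate(xs):
        yield y, xs[:i] + xs[i + 1:]

def perm(xs):
    """Yield every permutation of the list xs."""
    if not xs:
        yield []
        return
    for y, rest in delete(xs):
        for ys in perm(rest):
            yield [y] + ys

def set_of(solutions):
    # Analogue of the eager set constructor: collect every solution.
    return list(solutions)
```

Calling `set_of(perm([1, 2, 3]))` collects the six permutations listed above.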

In addition to the eager set expression, Parlog84 defines a lazy subset constructor. It
takes the form

subset(list_of_vars?, term?, conjunction?)

where list_of_vars is a stream of uninstantiated variables supplied to the subset expression
either directly or incrementally by another process. The subset expression will attempt
to bind these variables with instantiations of term which satisfy conjunction, in the order
they would be produced by chronological backtracking. Each new solution is generated in
a demand-driven manner, as soon as a new variable appears in the list_of_vars stream. If
no  more  solutions  can  be  found,  all  subsequent  variables  are  bound  to  END.  This  lazy  set 
constructor  is  particularly  useful  in  applications  which  would  rather  not  have  the  solution 
generator  run  ahead  of  its  consumers.  For  example  the  call 

?- subset([l1,l2,l3],perm,perm([1,2,3],perm))

will perform the variable bindings

l1 = [1,2,3], l2 = [1,3,2], l3 = [2,1,3]

whereas the call

?- subset([l1,l2,l3],perm,perm([x,y],perm))

performs

l1 = [x,y], l2 = [y,x], l3 = END.
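The two calls above can be reproduced with a short Python sketch. It assumes that the number of supplied variables is known up front as `k`, rather than arriving incrementally, and it uses `itertools.permutations` as a stand-in for the perm relation because it happens to enumerate permutations in the same order; the names `subset` and `perms` are illustrative.

```python
from itertools import islice, permutations

END = "END"

def perms(xs):
    # Stand-in for the perm relation: a lazy stream of permutations.
    return (list(p) for p in permutations(xs))

def subset(k, solutions):
    """Take at most k solutions on demand and pad the remainder with END."""
    found = list(islice(solutions, k))
    return found + [END] * (k - len(found))
```

Only as many solutions are computed as there are variables to bind, so the generator never runs ahead of its consumer.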


7     Guarded  Horn  Clauses 

CP, but especially Parlog and its notion of suspension, inspired yet another parallel programming
model. The language was named Guarded Horn Clauses (GHC) [Ued85] and was
developed by Ueda at the Institute for New Generation Computer Technology in Japan. As its
developer recalls, GHC's design was neither implementation-oriented nor application-oriented
[Ued86b]. The language, rather, followed a principle-oriented design philosophy aiming at
simplicity, expressiveness, generality, architecture independence, implementability and efficiency.
Finally, the language had to be inherently parallel to reflect the parallelism present in
Horn clauses. Clearly, this proposal was motivated by idealism more than any preceding one.
Ueda  realized  that  Horn  clauses  themselves,  augmented  with  the  concept  of  the  guard  could 
probably  satisfy  all  of  the  above  requirements.  The  resulting  language  obtains  processes  and 
communication  in  the  same  way  as  CP  and  Parlog  do,  but  employs  neither  special  purpose 
annotations  nor  mode  declarations  as  a  means  of  synchronization.  Synchronization  can  be 
specified  by  assigning  appropriate  semantics  to  guards.  The  name  of  the  language  derives 
exactly  from  the  fact  that  guards  were  the  only  addition  to  pure  Horn  clauses  in  the  process 
of  deriving  a  parallel  programming  language. 

The  semantics  of  guards  in  GHC,  originally  presented  in  [Ued85]  and  further  clarified  in 
[Ued86b]  and  [Ued86c],  are  summarized  below: 

•  Rule  of  synchronization:  The  unification  and  the  evaluation  of  the  guard  of  a  clause 
suspend  if  they  may  only  proceed  by  performing  bindings  visible  to  the  caller  of  that 
clause. 

•  Rule  of  sequencing:  The  evaluation  of  the  body  of  a  clause  suspends,  if  it  may  only 
proceed  by  performing  bindings  visible  to  the  guard  or  the  caller  of  the  clause,  before 
commitment  to  that  clause. 

•  Rule  of  commitment:  A  single  clause  whose  unification  and  guards  complete  successfully 
is  indivisibly  selected  to  solve  the  call. 

The  first  two  rules,  also  called  suspension  rules,  imply  that  unification,  guard  evaluation 
and  body  execution  may  all  proceed  internally  and  among  themselves  in  parallel  and  yet 
there is no need for multiple environments: whenever multiple values would be assigned to
a  single  variable,  evaluation  suspends.  The  evaluation  may  resume  only  when  some  other 
process  instantiates  the  variable  which  caused  suspension.  Figure  28  shows  the  clauses  for 
list  partitioning  in  GHC.  Note  that  output  arguments  are  specified  in  the  body  by  using  the 

partition(X,[L|Ls],L1,Ys) :- L =< X | partition(X,Ls,Xs,Ys), L1 = [L|Xs].
partition(X,[L|Ls],Xs,L2) :- L > X | partition(X,Ls,Xs,Ys), L2 = [L|Ys].
partition(X,[],L1,L2) :- true | L1 = [], L2 = [].

:- partition(X,L,L1,L2).

Figure  28:  List  partitioning  in  GHC. 

predefined  unification  "="  primitive.  In  this  respect,  GHC  resembles  Kernel  Parlog  and  in 
fact  GHC  avoids  mode  declarations  in  a  similar,  yet  slightly  more  elegant  fashion.  However, 
as  in  Parlog,  GHC  programs  are  directional  and  communication  constraints  are  expressed  at 
a  procedure  level.  As  another  example  of  GHC,  consider  a  demand-driven  Fibonacci  number 
generator  illustrated  in  figure  29  [Ued86d]. 




fibonacci(Fs) :- true | fib(0,1,Fs).

fib(F1,F2,[F3|Fs]) :- true | F3 := F1+F2, fib(F2,F3,Fs).

fib(_,_,[]).

Figure  29:  Demand-driven  Fibonacci  number  generation  in  GHC. 
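The demand-driven behavior of figure 29 corresponds closely to a Python generator: no Fibonacci number is computed until the consumer asks for the next element. The sketch follows the clauses literally, so the stream starts at F1+F2 = 1; the use of `islice` to express the consumer's demand is an assumption of this illustration.

```python
from itertools import islice

def fib(f1=0, f2=1):
    # Mirrors fib(F1,F2,[F3|Fs]): compute F3 only when it is demanded.
    while True:
        f3 = f1 + f2
        yield f3
        f1, f2 = f2, f3

def fibonacci(k):
    """Demand the first k numbers of the stream started by fib(0,1,Fs)."""
    return list(islice(fib(0, 1), k))
```

Closing the list in the GHC program, which selects the third clause, corresponds here to the consumer simply ceasing to request further items.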

Like CP and unlike Parlog, GHC defines no sequential operators. Ueda views sequentiality
merely as an optimization phase of a parallel program [Ued86b]. A compiler may translate
parallel  mode  of  execution  into  coroutining  or  even  sequential  execution,  if  the  underlying 
architecture  performs  better  for  the  specific  fragment  of  code.  A  requirement  remains  though, 
that  such  a  transformation  preserve  the  semantics  of  the  original  parallel  program  with  respect 
to  success,  suspension  and  failure.  Such  compiler  optimizations  are  made  possible  only  if  the 
underlying  language  is  kept  simple  and  this  is  in  particular  true  of  GHC. 

GHC  avoided  the  incorporation  of  other  impure  features  such  as  the  variable  predicate 
for  checking  whether  a  variable  is  uninstantiated,  the  metacall  predicate  and  so  on,  not  because 
they  were  felt  unnecessary  but  because  of  the  problems  associated  with  them.  In  particular, 
Ueda  [Ued85]  points  out  that  a  call  of  the  form 

call(X=0,S,C), X = 1

is  order  of  evaluation  dependent,  since  it  can  either  succeed  or  fail  depending  on  which  call  is 
evaluated first. Foster [Fos87] attributes the error in the above call to the fact that a variable
referenced by a metaprogram (X in our case) is passed to the object program (X = 1). He
proceeds by defining restrictions that should hold between these two distinct levels of programs
in order to eliminate inconsistencies such as the one above.
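The order dependence can be traced with a toy Python model of the conjunction. The model is an assumption made for illustration: bindings live in a dictionary, the two goals are evaluated strictly one after the other in the given order, and the metacall always succeeds while recording the outcome of X=0 in its status.

```python
def run(order):
    """Evaluate the conjunction  call(X=0,S,C), X = 1  in the given order.

    Returns (conjunction_succeeds, S).
    """
    bindings = {}
    status = None
    for goal in order:
        if goal == "metacall":
            # call(X=0, S, C) itself always succeeds; S reports what happened.
            if bindings.setdefault("X", 0) == 0:
                status = "SUCCEEDED"
            else:
                status = "FAILED"
        elif goal == "unify":
            # Plain X = 1 fails the whole conjunction if X is already 0.
            if bindings.setdefault("X", 1) != 1:
                return (False, status)
    return (True, status)
```

With the metacall first the conjunction fails, since X is already bound to 0 when X = 1 is attempted; with the unification first the conjunction succeeds and S is bound to FAILED.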

GHC  is  semantically  quite  close  to  Parlog.  GHC's  first  rule  of  suspension  embodies 
both  Parlog's  synchronization  mechanism  and  safety  check.  Whereas  Parlog  programs  with 
unsafe  guards  are  considered  erroneous  and  are  discarded  at  compile-time,  corresponding  GHC 
programs  are  correct  and  simply  suspend.  In  this  way,  no  semantic  analysis  is  required  for 
the  language:  GHC  programs  are  correct  if  and  only  if  they  are  syntactically  correct.  It  is 
the  rule  of  synchronization,  however,  that  poses  a  severe  practical  limitation  on  the  language. 
Performing a run-time test each time a variable is to be bound proves prohibitively expensive
on any of today's architectures and has been a major disappointment with GHC.

7.1      Flat  GHC 

As  in  the  case  of  CP,  a  flat  variant  of  GHC,  called  FGHC,  was  derived  to  cope  with  the 
implementation  difficulties  of  its  parent  language.  If  guards  in  a  GHC  program  are  from  within 
a  restricted  set  of  simple  predicates  and  the  evaluation  of  the  body  is  not  started  until  after 
commitment  to  a  clause,  it  is  possible  to  derive  an  entirely  static  analysis  of  unification  similar 
to  that  of  Parlog.  FGHC  can  therefore  compile  away  GHC's  run-time  overhead.  In  fact,  Ueda 
suggests that such optimization be employed even in full GHC for those programs whose Parlog
equivalents have safe guards [Ued86b]. This implementation difficulty of GHC was also realized
by Hirata [Hir86], whose language, called Oc, is a tight subset of GHC designed according to
the principle "deep guard considered harmful". As Takeuchi and Furukawa observe [TF86b],
FGHC,  Oc  and  Kernel  Parlog  with  flat  guards  are  all  equivalent  and  constitute  the  simplest 
parallel  logic  programming  language.  FGHC  has  attracted  a  lot  of  research  interest  recently 
as  it  was  selected  as  the  main  vehicle  for  the  development  of  the  parallel  inference  engine  of 
the  Japanese  Fifth  Generation  Computer  Project. 

7.2     Influence  on  Parlog 

A number of changes to Parlog84 have been reported recently, mainly as a result of the
tempting simplicity and elegance of GHC. The mode declarations are eliminated, since their
role was purely syntactic anyway. As a consequence, output arguments should be instantiated
using explicit unification calls in the body of the clause. In addition, assignment unification
was replaced by full unification. This change was motivated by three reasons:

•  Multiple  producers  may  now  be  supported;  their  produced  values  have  to  be  unifiable. 

•  Multi-moded  predicate  definitions  may  be  obtained. 

•  There  is  no  efficiency  loss,  provided  that  the  compiler  can  resolve  calls  with  variable 
left-hand  arguments  to  simple  assignment  unification  calls. 

Observe,  however,  that  the  addition  of  full  unification  does  not  add  any  power  to  the 
language,  as  its  effect  could  be  obtained  by  means  of  two  one-way  unifications  in  parallel. 
Finally,  the  syntax  of  the  language  was  also  slightly  modified  to  come  closer  to  that  of  the 
widely  accepted  Edinburgh  Prolog.  The  new  form  of  the  language,  called  Parlog86,  is  described 
to  some  extent  in  a  recent  report  by  Ringwood  [Rin88].  The  Parlog86  version  of  the  list 
partitioning program is identical to the GHC solution of figure 28.

Another recent proposal that was initiated mostly by a quest for expressiveness rather than
by  practicality  is  P-Prolog,  developed  by  Yang  and  Aiso  in  Japan  [YA86].  In  an  attempt  to 
obtain  more  Horn-clause  oriented  parallel  logic  programs,  P-Prolog's  design  objectives  were 
to  support  a  synchronization  mechanism  which  could  dynamically  determine  the  mode  of  use 
of  a  program  and  to  express  multiple  solution  programs  more  naturally.  A  mechanism  that 
appeared  to  be  suitable  for  both  goals  of  the  language  was  the  notion  of  exclusive  guarded 
Horn clauses. A set of alternative guarded clauses is said to be exclusive for a given call, if
only  one  clause  can  commit  for  that  call. 

P-Prolog  is  divided  into  a  single-neck  clause  subset  and  a  double-neck  clause  subset. 
Single-neck  clauses  are  equivalent  in  power  to  Parlog's  single  solution  clauses,  yet  allow  for 
potentially  improved  expressiveness.  The  communication  constraint  is  based  on  the  exclusive 
clauses  idea.  A  call  G  to  a  set  of  single-neck  clauses  commits  to  a  clause  C,  if  the  clauses  are 
exclusive  for  G  and  only  C  can  commit.  If  the  clauses  are  not  exclusive,  execution  suspends 
awaiting  further  instantiation  of  variables  in  G.  This  happens  because  the  language  regards 
single-neck  clauses  as  expected  exclusive.  As  a  consequence,  P-Prolog  eliminates  the  need  for 
mode declarations, annotations or even explicit unifications in the body. An example of a
P-Prolog program is shown in figure 30, which implements the list partitioning problem for
comparison  with  CP,  Parlog  and  GHC.  Although  elegant,  the  exclusive  clauses  principle  may 


partition(X,[L|Ls],[L|Xs],Ys) :- L =< X : partition(X,Ls,Xs,Ys).
partition(X,[L|Ls],Xs,[L|Ys]) :- L > X : partition(X,Ls,Xs,Ys).
partition(X,[],[],[]).


?- partition(X,L,L1,L2).


Figure  30:  List  partitioning  in  P-Prolog. 

cause  problems  in  cases  where  two  or  more  clauses  are  candidates  for  commitment,  but  any 
one  of  them  can  be  chosen  in  a  nondeterministic  fashion.  A  clear  example  of  such  a  predicate  is 
the  merge  relation  for  streams  presented  in  figure  13.  The  definition  of  the  merge  predicate 
in  P-Prolog  is  given  in  figure  31  borrowed  from  [YA86].  The  program  makes  use  of  an  other 

(1)  merge([X|Xs],Ys,[X|Zs]) :- merge(Ys,Xs,Zs).

(2)  merge([],Ys,Ys).

(3)  merge(Xs,[Y|Ys],[Y|Zs]) :- other : merge(Ys,Xs,Zs).

(4)  merge(Xs,[],Xs).

Figure  31:  Stream  merging  in  P-Prolog. 

predicate  which  may  appear  in  a  guard  and  is  similar  to  Parlog's  sequential  operator  A;  B  but 
allows  computation  to  proceed  to  B  even  if  A  is  just  suspended.  Interestingly  enough,  the 
above  definition  of  merge  predicate  behaves  in  a  quite  similar  way  to  the  fair  merge  operator 
described  in  [SM84].  Although  that  version  does  not  make  explicit  use  of  an  operator  like 
other,  its  use  is  implicit  in  the  requirement  that  the  underlying  implementation  is  stable.  An 
implementation  is  characterized  as  stable,  if  when  faced  with  two  or  more  candidate  clauses, 
the  textually  first  such  clause  is  chosen  to  commit.  The  above  program  obtains  stability  as 
follows:  in  case  only  the  first  or  both  input  streams  are  instantiated,  the  textually  first  clause 
commits;  if  only  the  second  stream  is  instantiated  then  clauses  (1)  and  (2)  are  not  exclusive 
and  other  switches  control  to  clauses  (3)  and  (4).  A  slight  difference  with  the  fair  merger, 
however,  appears  in  this  last  case:  the  behavior  of  other  is  not  revocable  to  allow  commitment 
to  clauses  (1)  or  (2),  even  if  the  first  stream  is  instantiated  at  a  later  time.  In  cases  where  a 
fair  merge  is  not  required  (e.g.  when  streams  are  produced  at  a  very  slow  rate),  the  P-Prolog 
solution  may  impose  a  substantial  unnecessary  synchronization  delay. 
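
The exclusive clauses rule for single-neck clauses can be sketched in Python for the partition program of figure 30. This is only an illustration under stated assumptions: None stands for an uninstantiated variable, only the first two arguments of the call are inspected, and the function names and the three status values are hypothetical rather than part of P-Prolog.

```python
UNBOUND = None  # stands for an uninstantiated variable


def partition_statuses(x, lst):
    """Status of the three clauses of figure 30 for partition(x, lst, _, _).

    'ok'    : head unification and guard have succeeded
    'maybe' : the clause may still succeed once more variables are bound
    'fail'  : the clause can never be selected for this call
    """
    if lst is UNBOUND:                    # every clause head still unifies
        return ["maybe", "maybe", "maybe"]
    if lst == []:
        return ["fail", "fail", "ok"]
    head = lst[0]
    if head is UNBOUND or x is UNBOUND:   # guards L =< X and L > X undecided
        return ["maybe", "maybe", "fail"]
    return ["ok" if head <= x else "fail",
            "ok" if head > x else "fail",
            "fail"]


def select(statuses):
    """Commit only if the clauses are exclusive for the call; else suspend."""
    candidates = [i for i, s in enumerate(statuses) if s != "fail"]
    if not candidates:
        return ("fail", None)
    if len(candidates) == 1 and statuses[candidates[0]] == "ok":
        return ("commit", candidates[0])
    return ("suspend", None)
```

With the pivot and the list both known, exactly one clause remains and the call commits; with the list or the pivot still unbound, more than one clause is a candidate and the call suspends. Note that select has to examine every clause before committing, which is the extra cost of testing exclusiveness.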

This  novel  way  of  expressing  communication  constraints  possesses  the  interesting  property 
that  the  direction  of  data  flow  is  determined  at  run-time.  P-Prolog  programs  can  thus  be 
multidirectional  as  in  Prolog.  As  with  actual  Prolog  programs,  though,  the  applicability  of 
this  unique  feature  of  logic  programs  is  limited.  First,  programs  often  require  rewriting  to  work 
in  both  directions,  which  may  not  always  be  trivial.  System  predicates  should  also  work  in  all 
modes  which  may  be  particularly  tedious  or  difficult  to  define  (e.g.  in  arithmetic  predicates). 
Finally, the definition of a problem through that of its dual problem (the problem obtained by
inverting the direction of data flow) is not always as naturally or as efficiently expressed as
when specified as a problem by itself. Consider for example the definition of an all-permutations
relation as that of quicksort with reverse mode of use.

The  double-neck  subset  of  P-Prolog  corresponds  to  Parlog's  multiple  solution  subset, 
but  is  potentially  more  powerful  and  combines  more  elegantly  with  the  rest  of  the  language. 
Double-neck  clauses  are  not  expected  exclusive.  If  two  or  more  clauses  appear  to  be  candidates 
for  commitment  they  are  all  pursued  in  parallel.  The  commit  operator  can  be  regarded  as  a 
simple  sequential  conjunctive  operator.  Consider  as  an  example  the  double-neck  definition  for 
a  parallel  member  relation,  used  in  a  mode  which  lists  all  members  of  its  second  argument 
list.  The  definition  is  shown  in  figure  32  as  it  appears  in  [YA86].  Notice  the  importance  of  the 

member(X,[X|Ys]) :-- Ys \== [] : true.
member(X,[_|Ys]) :-- Ys \== [] : member(X,Ys).
member(X,[X]).

Figure  32:  Or-parallel  list  enumeration  in  P-Prolog. 

guards  in  specifying  that  the  second  argument  Ys  is  really  a  consumer  of  values  produced  by 
some  other  process.  Built-in  calls,  such  as  "\==",  suspend  in  P-Prolog  so  they  can  be  used 
for synchronization purposes as illustrated by the above example. By issuing a call of the
form

member(X,L1), member(X,L2)

we can locate an element in the intersection of lists L1 and L2. As Yang and Aiso note,
programming  in  P-Prolog  requires  attention  to  the  exclusive  relation  between  clauses  rather 
than  to  the  exact  specification  of  the  I/O  pattern.  Although  this  observation  is  generally  true 
for  single-neck  programs,  the  exclusive  relation  appears  to  be  an  inadequate  mechanism  when 
extended  to  double-neck  clauses.  Additional  guards,  with  no  declarative  meaning,  have  to 
be  carefully  included  by  the  programmer  to  complement  the  synchronization  mechanism.  In 
addition,  such  guards  often  specify  the  I/O  pattern  explicitly. 

No  serious  attempt  has  been  made  towards  a  real  implementation  of  P-Prolog  but  many 
deficiencies are anticipated; in fact, this seems to be the language's weakest point. The
synchronization scheme of P-Prolog is considerably more elaborate than that of other committed-
choice  languages:  to  test  whether  a  set  of  clauses  is  exclusive  is  a  much  stronger  decision  than 
testing  whether  unification  with  any  of  them  can  succeed.  Even  less  promising  appears  to 
be  a  reasonably  efficient  realization  of  the  double-neck  subset.  Experience  from  other  parallel 
logic  languages  suggests  that  direct  support  of  multiple  solution  finding  implies  an  unavoidable 
overhead.

The last language to be described in this report was developed at the European Computer-Industry
Research Center (ECRC) by Ratcliffe, Robert and Syre [RR86,RS87] and is the main
language for the PEPSys (Parallel ECRC Prolog System). This language is very different from
the proposals presented so far, since it does not derive from RL. In fact, it does not employ
guards or committed-choice nondeterminism since synchronization is not part of the language.
It  is  a  language  that  can  potentially  exploit  parallelism  available  on  the  underlying  machine 
rather  than  a  language  which  can  describe  concurrent  activities.  In  this  respect,  this  proposal 
is  really  part  of  the  "executing  Prolog  in  parallel"  effort,  but  is  included  here  to  provoke  some 
discussion  on  this  alternative  of  deriving  parallelism. 

This approach so far has been restricted to analyzing Prolog programs in an attempt
to detect goal independence, i.e. goals that share no uninstantiated variables. Clearly
such goals can execute in parallel. Although some run-time decision is necessary before goal
independence can be established, methods were developed that perform an extensive
compile-time analysis which leaves only a minimal overhead during execution. An example of this
kind of analysis is presented by Chang, Despain and DeGroot [CDD85], where the user need
only specify the mode of use of the top-level goal. Tung and Moldovan [TM86] pursued this
approach even further and incorporated an analysis which infers "reasonable" mode declarations
as well. These methods often trade attained parallelism for compiler complexity. The
greater the number of execution-time decisions concerning data dependencies, the higher the
run-time overhead and the larger the potential for obtaining more parallelism; conversely,
the more complicated the static analysis, the lower the execution overhead but the more
restricted the resulting parallelism tends to be.
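
The run-time part of such an independence test can be sketched in Python. This is a minimal illustration rather than the algorithm of [CDD85] or [TM86]: goals are represented as nested tuples, Var is a hypothetical single-assignment variable, and two goals are taken to be independent when the sets of uninstantiated variables reachable from them are disjoint.

```python
class Var:
    """A logic variable; binding stays None while it is uninstantiated."""
    def __init__(self, name):
        self.name = name
        self.binding = None


def free_vars(term):
    """Collect the uninstantiated variables reachable from a term."""
    if isinstance(term, Var):
        if term.binding is None:
            return {term}
        return free_vars(term.binding)   # follow the binding
    if isinstance(term, (tuple, list)):
        found = set()
        for sub in term:
            found |= free_vars(sub)
        return found
    return set()                         # constants contain no variables


def independent(goal_a, goal_b):
    """Goals may run in parallel if they share no uninstantiated variable."""
    return free_vars(goal_a).isdisjoint(free_vars(goal_b))
```

For example, the goals p(X) and q(X,Y) are dependent while X is unbound, but become independent as soon as X is bound to a ground term. A compile-time analysis aims to decide as many of these cases as possible in advance, so that the traversal above is rarely needed during execution.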

PEPSys  Prolog  was  the  first  serious  effort  to  include  the  above  method  of  extracting 
parallelism  under  the  frame  of  a  new  language.  The  language  is  divided  into  a  regular  Prolog 
subset  and  into  a  parallel  subset.  The  parallel  subset  employs  restricted  AND-parallelism  but 
it  may  also  exploit  OR-parallelism  in  the  process  of  deriving  multiple  solutions  for  a  goal.  The 
design  objectives  for  this  language  are  quite  practically  oriented: 


• Obtain parallelism without departing from the usual Prolog semantics.

• Provide sufficient control to the user to judge when parallelism is desired.

• Include sequential Prolog as a subset of the language.

• Preserve multiple solution finding even for the parallel subset of the language.


The  user  in  PEPSys  Prolog  may  specify  a  parallel  execution  of  alternative  clauses,  a  lazy  mode 
of  evaluation,  an  exhaustive  search  or  even  a  single  solution  computation.  In  addition,  the 
user  can  convey  information  to  the  system  as  to  whether  the  clause  ordering  is  significant  or 
not.  Depending  on  its  saturation  level  at  the  time  of  execution,  the  system  may  or  may  not 
grant  the  user's  request  for  either  AND-  or  OR-parallelism.  This  additional  control  which  is 
offered  to  the  user  is  a  distinctive  feature  of  the  approach;  however,  this  might  turn  out  to  be 
a  burden  rather  than  an  advantage  when  developing  large  programs. 

Finally,  the  language  provides  a  simple  modularization  facility  (similar  to  Modula-2), 
which  encapsulates  serial  or  parallel  modules.  Whereas  access  from  a  serial  to  a  parallel 
module  is  not  permitted,  the  opposite  interface  is  provided  by  built-in  predicates  such  as 
bagof,  oneof  and  firstof  with  the  obvious  meaning. 

The  features  of  the  language  are  clarified  by  means  of  the  quicksort  example  in  a  slightly 
modified  version  of  that  described  by  Ratcliffe  and  Robert  [RR86].  Figure  33  shows  the  code 
for the two modules interfacing by means of the oneof predicate, since only one solution

Serial Module

?- import_from('qsort.par',[qsort/2]).

?- oneof(qsort(List,SList,[])), write(SList).

Parallel Module

?- export([qsort/2]).

:- properties([clauses(unordered), execution(lazy)]).

qsort([X|Xs],S0,SS) :- partition(X,Xs,L1,L2),
                       qsort(L1,S0,[X|S1]) # qsort(L2,S2,SS), S1 = S2.
qsort([],S,S).

:- properties([solution(one), clauses(ordered), execution(lazy)]).

partition(X,[L|Ls],[L|Xs],Ys) :- L =< X, partition(X,Ls,Xs,Ys).
partition(X,[L|Ls],Xs,[L|Ys]) :- partition(X,Ls,Xs,Ys).
partition(X,[],[],[]).

Figure  33:  A  serial  and  a  parallel  module  defining  quicksort  in  PEPSys  Prolog. 

will be returned, anyway. The lazy execution property discourages OR-parallel execution of
alternative clauses whereas the ordered clauses property of partition sequentializes the
order of their consideration. The "#" operator urges the system to execute the two qsort
calls in parallel. Note the distinct variables S1 and S2 assigned to these calls stressing their
independence. Their unification is delayed until after the parallel calls. Parallelism cannot be
exploited if S1 and S2 are expressed as a common variable inside the two qsort calls, a rather
serious weakness of the language.
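
The shape of this computation, two independent calls followed by a join, can be imitated in Python. The sketch below is not PEPSys Prolog: it uses plain lists instead of difference lists, runs only the top-level pair of recursive calls as separate tasks, and relies on the standard concurrent.futures module. Python threads illustrate the structure only and are not expected to yield a speed-up here.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(pivot, items):
    """Split items into those not greater than the pivot and the rest."""
    smaller = [v for v in items if v <= pivot]
    larger = [v for v in items if v > pivot]
    return smaller, larger


def qsort_seq(items):
    """Ordinary sequential quicksort."""
    if not items:
        return []
    pivot, rest = items[0], items[1:]
    smaller, larger = partition(pivot, rest)
    return qsort_seq(smaller) + [pivot] + qsort_seq(larger)


def qsort_par(items):
    """Sort the two partitions as independent tasks, then join the results."""
    if not items:
        return []
    pivot, rest = items[0], items[1:]
    smaller, larger = partition(pivot, rest)
    with ThreadPoolExecutor(max_workers=2) as pool:
        left = pool.submit(qsort_seq, smaller)    # the two tasks share no data
        right = pool.submit(qsort_seq, larger)
        # Joining only after both tasks finish plays the role of S1 = S2.
        return left.result() + [pivot] + right.result()
```

As in the PEPSys program, the two tasks cannot exchange partial results while they run; the results are combined only after both have completed.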

Overall, this direction of parallelization may prove satisfactory for current state-of-the-art,
coarse-grain architectures. As technology improves, though, and machines become capable
of exploiting finer grains of parallelism, the restricted AND-parallelism approach may prove
insufficient, and languages extracting all possible parallelism via synchronization and
incremental communication may prove far superior.


10      Object  Oriented  Programming 

Object oriented programming proposes an alternative to applicative programming in the effort
to overcome the problems associated with the imperative programming style. The object oriented
paradigm,  which  has  motivated  the  proposal  of  several  languages  so  far,  employs  a  novel  unified 
view  of  data  and  programs.  Objects,  which  constitute  the  elementary  structures  in  a  program, 
share  characteristics  of  both  data  structures  and  procedures  operating  on  them.  Objects 
communicate by means of messages only. Communication channels are set up for this purpose
during  object  creation  time.  Computation  in  an  object  oriented  framework  can  be  viewed  as 
a  cooperation  of  autonomous  communicating  units.  Classes  with  common  properties  can  be 
abstracted  in  class  objects.  A  hierarchical  structure  of  objects,  ranging  from  more  generalized 
to  more  specialized,  can  be  created  this  way.  Objects  automatically  inherit  all  properties  of 
their ancestor class objects.

Parallel logic programming languages have been shown to lend themselves quite naturally
to  the  object  oriented  programming  paradigm.  This  direction  was  first  explored  by  Shapiro 
and  Takeuchi  [ST83]  soon  after  the  development  of  CP.  In  a  similar  fashion  to  the  process 
interpretation  of  logic  programming,  we  can  define  object  creation  and  termination,  inter- 
object  communication,  networks  of  communicating  objects,  object  state  and  so  on.  In  addition, 
the  inheritance  mechanism  can  easily  be  obtained;  an  example  is  given  below.  Combined  with 
the  "incomplete  message"  feature  of  logic  programs  for  an  implicit  two-way  communication, 
very  expressive  object  oriented  programs  can  be  derived. 

In what follows, we present an example of a stack class capable of generating and manipulating
an arbitrary number of stack objects. An object O willing to allocate a stack object
sends the appropriate message to the stack class. A new stack object is generated and assigned
a channel for receiving requests from other objects. This channel is then returned to
object O, which can subsequently communicate with the stack directly. In addition to this
incoming  channel,  a  channel  between  the  object  and  the  parent  class  is  also  established.  This 
enables  the  stack  object  to  send  to  the  parent  class  messages  whose  methods  are  not  known 
or  even  to  inform  the  parent  upon  its  termination.  The  idea  of  propagating  messages  that 
cannot  be  handled  by  an  object  one  level  higher  in  the  object  hierarchy  is  known  as  delegation 
and  is  a  mechanism  for  implementing  inheritance  in  a  number  of  object  oriented  languages. 

While  the  stack  class  can  be  asked  to  generate  a  stack  or  to  specify  how  many  stack 
objects  are  currently  active,  the  stack  object  itself  deals  with  the  stack  operations  per  se.  The 
code  for  the  stack  object  in  Parlog86  is  shown  in  figure  34.   The  three  parameters  represent 

stack([push(X)|Cmds],S,OutCmds) <- stack(Cmds,[X|S],OutCmds).

stack([pop(X)|Cmds],[Y|S],OutCmds) <- X = Y, stack(Cmds,S,OutCmds).

stack([empty|Cmds],S,OutCmds) <- stack(Cmds,[],OutCmds).

stack([kill],S,OutCmds) <- OutCmds = [kill].

stack([Cmd|Cmds],S,OutCmds) <- stack(Cmds,S,OutCmds1), OutCmds = [Cmd|OutCmds1].

Figure  34:  A  stack  object  in  Parlog86. 

the  incoming  message  stream,  the  state  (actual  stack  of  elements)  and  the  outgoing  stream 
leading  to  the  parent  class.  The  first  three  clauses  each  define  a  method  for  a  specific  message 
type: push, pop and empty respectively. The fourth clause terminates the stack object when a
kill message arrives on its incoming stream. The kill message is propagated to the outgoing channel to
notify  the  parent  class.  Finally,  the  last  clause  helps  redirect  class  messages,  whose  methods 
are  not  known,  to  the  parent  class. 
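
The behavior of the stack object can be paraphrased in Python as a function that consumes a stream of messages. This is a sketch under stated assumptions, not a translation of Parlog semantics: messages are tuples whose first element is the method name, the "incomplete message" pop(X) is modelled by a list that the object fills in as its reply, and the function returns the final state together with the messages sent to the parent class.

```python
def stack(cmds):
    """Process a message stream; return (final_state, forwarded_messages)."""
    items = []   # the state; the top of the stack is the last element
    out = []     # messages sent on to the parent class
    for cmd in cmds:
        tag = cmd[0]
        if tag == "push":
            items.append(cmd[1])
        elif tag == "pop" and items:
            cmd[1].append(items.pop())   # fill in the reply slot
        elif tag == "empty":
            items = []
        elif tag == "kill":
            out.append(cmd)              # notify the parent and terminate
            break
        else:
            out.append(cmd)              # unknown method: delegate
    return items, out
```

As in figure 34, a pop on an empty stack matches no stack method and is therefore passed to the parent, and nothing after a kill message is processed.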

Figure  35  illustrates  the  clauses  for  the  stack  class.  The  first  clause  creates  a  new  stack 
object, hands its incoming channel back to the owner and connects its outgoing channel to the
incoming channel of the class itself. Pictorially, the stack class with a number of generated objects can
be  represented  as  in  figure  36.  The  second  and  third  clauses  report  the  number  of  active 
stacks  and  update  the  state  (number  of  stacks)  when  a  stack  is  no  longer  needed.  At  this 
point,  it  should  be  clear  that  the  state  of  the  class  depends  on  the  needs  of  the  particular 
methods.  If,  for  example,  the  stack  class  had  been  required  to  occasionally  send  a  broadcast 
message  to  all  active  stacks  (such  as  push  an  element),  a  list  of  all  stacks  would  have  been 
maintained in the state as well. Finally, the last clause propagates unknown messages to its

stack-class([create(Stack)|Cmds],N,OutCmds) <- stack(Stack,[],BackCmds),
    merge(Cmds,BackCmds,AllCmds), N1 is N+1, stack-class(AllCmds,N1,OutCmds).

stack-class([how-many(X)|Cmds],N,OutCmds) <- X = N, stack-class(Cmds,N,OutCmds).

stack-class([kill|Cmds],N,OutCmds) <- N1 is N-1, stack-class(Cmds,N1,OutCmds).

stack-class([Cmd|Cmds],N,OutCmds) <-
    stack-class(Cmds,N,OutCmds1), OutCmds = [Cmd|OutCmds1].


Figure  35:  A  stack  class  in  Parlog86. 


Figure  36:  A  network  of  a  stack  class  and  various  stack  objects. 

parent class. If there is no parent class, an error may be reported instead. An object that
wishes to allocate a stack may send a message [create(MyStack)] to stack-class and subsequently
bind stream MyStack to requests concerning the actual stack or its parent class, such
as [push(1),push(2),how-many(N),pop(X)|Reqs]. The stack requests will be handled by the
allocated stack object, whereas all others will be forwarded to the parent class.
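
The delegation scheme can also be sketched with ordinary Python objects. The sketch replaces streams and merge processes by direct method calls, so it shows only how requests are routed, not the concurrent behavior; StackClass, StackObject and send are hypothetical names.

```python
class StackClass:
    """Creates stack objects and keeps count of the active ones."""
    def __init__(self):
        self.active = 0

    def send(self, msg, *args):
        if msg == "create":
            self.active += 1
            return StackObject(parent=self)
        if msg == "how-many":
            return self.active
        if msg == "kill":
            self.active -= 1
            return None
        # No parent class above this one, so report an error.
        raise ValueError(f"no method for message {msg!r}")


class StackObject:
    """Handles the stack methods itself and delegates everything else."""
    def __init__(self, parent):
        self.parent = parent
        self.items = []

    def send(self, msg, *args):
        if msg == "push":
            self.items.append(args[0])
            return None
        if msg == "pop" and self.items:
            return self.items.pop()
        if msg == "empty":
            self.items = []
            return None
        return self.parent.send(msg, *args)   # delegation to the parent class
```

Sending push(1), push(2), how-many and pop to a newly created stack mirrors the request stream above: the how-many request is answered by the class, all others by the stack object itself.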

The  above  discussion  was  based  on  the  work  of  Shapiro  and  Takeuchi  [ST83],  Clark 
[Cla87]  and  Davison  [Dav87]. 

11      Deriving  Multiple  Solutions 


Committed-choice  parallel  languages  appear  to  exploit  all  possible  AND-parallelism,  but  they 
violate  the  completeness  rule  of  Prolog  since  only  a  single  clause  commits  to  a  goal.  This 
seems  to  be  a  serious  limitation  which  is  likely  to  persist,  at  least  until  methods  for  elegantly 
combining  stream  AND-parallelism  with  OR-parallel  exploration  of  multiple  solutions  have 
been  clearly  identified.  We  briefly  described,  however,  two  attempts  in  the  preceding  sections: 
a practical solution, the set predicate in Parlog and a more ambitious one, the double-neck
subset of P-Prolog. While the main limitation of the latter appears to be the implementation
overhead,  the  former  is  restricted  by  its  AND-sequential  nature.  Furthermore,  an  interpreted 
evaluation  of  the  set  predicate  based  on  repeated  clause  copying,  such  as  the  one  proposed 
by  Clark  and  Gregory  [CG85]  will  place  an  additional  run-time  overhead. 

Recent  investigation  into  this  direction  by  Ueda  and  Tamaki  from  Japan  has  brought  up 
some  interesting  results.  If  the  I/O  pattern  of  use  for  each  predicate  is  known  and  if  every 
predicate  is  always  called  with  ground  input  terms  and  returns  with  ground  output  terms,  the 
evaluation  of  set  can  be  statically  rewritten  into  a  committed-choice  program  which  yields  the 
same  behavior  as  the  corresponding  multiple  solution  one.  The  first  algorithm  for  this  source- 
to-source  translation,  described  by  Ueda  [Ued87b],  was  based  on  the  continuation  model  of 
Prolog  and  was  intended  for  application  in  GHC.  Transformed  programs  exhibited  comparable 
or  even  better  performance  than  corresponding  Prolog  programs  using  the  extralogical  bagof 
predicate.  Figure  37  shows  the  GHC  clauses  for  a  parser  of  a  miniature  natural  language 
and  its  translation  for  exhaustive  search.  Notice  that  det,  noun  and  verb  predicates  (the 
dictionary)  are  not  expanded  since  they  are  deterministic  by  nature.  Exhaustive  search  is 
useful  not  only  for  deriving  multiple  solutions  but  also  for  dealing  with  predicates  whose  guards 
are  not  capable  of  selecting  the  appropriate  clause,  as  illustrated  by  the  example.  The  main 
disadvantage  of  continuation-based  methods  is  that  they  impose  strict  AND-sequentiality. 

Trying  to  break  away  from  such  sequentiality  Tamaki  [Tam87]  based  his  transformation 
method  on  the  stream  model  of  parallel  logic  languages.  Output  variables  are  translated  into 
streams  of  values  which  are  subsequently  fed  in  a  parallel  fashion  to  their  consumers.  Special 
interface predicates are devised for this purpose. The method essentially transforms the
nondeterministic assertional representation of data into a deterministic structured term
representation. Although the stream-based approach bears a lot of similarities with the continuation-
based  one,  it  manages  to  exploit  restricted  AND-parallelism  similar  to  that  of  PEPSys  Prolog. 
Independently, Ueda [Ued87a] extended his continuation scheme to include subcontinuations
in an attempt to incorporate stream-parallelism into his model. The resulting method can
simulate the coroutine mode of execution, very useful in the context of generate-and-test
problems as seen in section 3. An even more promising solution seems to be a combination of all
three  approaches  based  on  information  derived  by  a  static  program  analysis. 
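
The idea behind the stream model, in which each output becomes a stream of alternative values that is fed to its consumers, can be illustrated in Python with generators, using the miniature grammar of figure 37. This is only an analogy and not Tamaki's transformation [Tam87]: Python generators are evaluated sequentially and on demand, and all names below are illustrative.

```python
def word(vocabulary, tag):
    """Build a dictionary predicate that yields at most one (tree, rest) pair."""
    def parse(tokens):
        if tokens and tokens[0] in vocabulary:
            yield (tag, tokens[0]), tokens[1:]
    return parse


det = word({"the"}, "det")
noun = word({"man"}, "noun")
verb = word({"walks"}, "verb")


def np(tokens):
    """Stream of all noun phrases that start the token list."""
    for d, rest in det(tokens):
        for n, rest2 in noun(rest):
            yield ("np", d, n), rest2


def vp(tokens):
    """Stream of all verb phrases; both grammar rules contribute."""
    for v, rest in verb(tokens):
        yield ("vp", v), rest
        for n, rest2 in np(rest):
            yield ("vp", v, n), rest2


def s(tokens):
    """Each noun phrase produced by np is fed to the consumer vp."""
    for n, rest in np(tokens):
        for v, rest2 in vp(rest):
            yield ("s", n, v), rest2
```

Collecting list(s(tokens)) gathers every parse together with the tokens it leaves unconsumed, which corresponds to the list of solutions built by the transformed program.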

From  a  totally  different  origin  is  the  work  of  Bansal  and  Sterling  on  the  subject  [BS87].  In 
their attempt to translate regular Prolog programs into FCP, they came up with an enumerate-
and-filter paradigm for simulating the equivalent of a parallel backtracking mechanism. All
possible  solutions  are  enumerated  using  an  all  solutions  predicate  and  for  each  new  solution 
a  "filter"  process  is  spawned,  which  determines  whether  it  should  be  retained  in  the  final 
solution  list  or  not.  Both  enumeration  and  filter  routines  are  driven  by  interpreters  allowing 
for  a  straightforward  implementation  at  the  expense  of  a  run-time  overhead.  OR-parallelism 
is  obtained,  as  usual,  by  transformation  into  AND-parallelism.  AND-  and  stream  parallelism 
can  be  derived  as  a  result  of  the  parallelism  between  enumerating  and  filtering  processes. 
Techniques  that  will  reduce  some  of  the  run-time  overhead  by  automated  source-to-source 
transformations and partial evaluation are topics of current research.

Single Solution Version

s(X,Z,S) :- true | np(X,Y,NP), vp(Y,Z,VP), S = s(NP,VP).

np(X,Z,NP) :- det(X,Y,D), noun(Y,Z,N) | NP = np(D,N).

vp(X,Z,VP) :- verb(X,Z,V) | VP = vp(V).

vp(X,Z,VP) :- verb(X,Y,V), np(Y,Z,NP) | VP = vp(V,NP).

det([the|X],Z,D) :- true | Z = X, D = det(the).
noun([man|X],Z,N) :- true | Z = X, N = noun(man).
verb([walks|X],Z,V) :- true | Z = X, V = verb(walks).

Exhaustive Search Version

s(X,Cont,S0,S1) :- true | np(X,l1(Cont),S0,S1).
s(_,_,S0,S1) :- true | S0 = S1.

np(X,Cont,S0,S1) :- det(X,Y,D), noun(Y,Z,N) | contnp(Cont,np(D,N),Z,S0,S1).

np(_,_,S0,S1) :- true | S0 = S1.

vp(X,Cont,S0,S1) :- true | vp1(X,Cont,S0,S2), vp2(X,Cont,S2,S1).
vp1(X,Cont,S0,S1) :- verb(X,Z,V) | contvp(Cont,vp(V),Z,S0,S1).
vp1(_,_,S0,S1) :- true | S0 = S1.

vp2(X,Cont,S0,S1) :- verb(X,Y,V) | np(Y,l3(V,Cont),S0,S1).
vp2(_,_,S0,S1) :- true | S0 = S1.

conts(l0,S,Z,S0,S1) :- true | S0 = [(S,Z)|S1].

contnp(l1(Cont),NP,Y,S0,S1) :- true | vp(Y,l2(NP,Cont),S0,S1).
contnp(l3(V,Cont),NP,Z,S0,S1) :- true | contvp(Cont,vp(V,NP),Z,S0,S1).

contvp(l2(NP,Cont),VP,Z,S0,S1) :- true | conts(Cont,s(NP,VP),Z,S0,S1).

:- s([the,man,walks],l0,Tree,[]).

Figure  37:  Translating  exhaustive  search  into  GHC  using  continuations. 
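
The continuation idea behind the exhaustive search version can be paraphrased in Python. The sketch is an analogy, not Ueda's translation [Ued87b]: continuations are Python closures rather than the data structures l0 to l3, the difference list that threads the solutions is replaced by a list that is appended to, and all names are illustrative. Every nondeterministic predicate takes the "rest of the computation" as an argument and calls it once per alternative, so that no backtracking is needed.

```python
DET, NOUN, VERB = {"the"}, {"man"}, {"walks"}


def word(vocabulary, tag, tokens):
    """Deterministic dictionary lookup: return (tree, rest) or None."""
    if tokens and tokens[0] in vocabulary:
        return (tag, tokens[0]), tokens[1:]
    return None


def np(tokens, cont):
    d = word(DET, "det", tokens)
    if d is None:
        return
    n = word(NOUN, "noun", d[1])
    if n is None:
        return
    cont(("np", d[0], n[0]), n[1])


def vp(tokens, cont):
    v = word(VERB, "verb", tokens)
    if v is None:
        return
    verb_tree, rest = v
    cont(("vp", verb_tree), rest)              # first rule (vp1)
    np(rest, lambda np_tree, rest2:            # second rule (vp2)
       cont(("vp", verb_tree, np_tree), rest2))


def s(tokens, cont):
    np(tokens, lambda np_tree, rest:
       vp(rest, lambda vp_tree, rest2:
          cont(("s", np_tree, vp_tree), rest2)))


def parse_all(tokens):
    """Collect every (tree, remaining_tokens) pair, like the list of (S,Z) pairs."""
    solutions = []
    s(tokens, lambda tree, rest: solutions.append((tree, rest)))
    return solutions
```

The two rules for vp are tried one after the other inside a single call, which reflects the strict AND-sequentiality that the text attributes to continuation-based methods.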

12      Conclusion  and  Future  Directions 

The  purpose  of  this  paper  was  twofold:  on  the  one  hand  it  attempted  to  present  the  main 
members  of  the  parallel  logic  programming  languages  family;  on  the  other,  to  include  some 
characteristic  programming  examples,  techniques  and  paradigms  of  parallel  logic  programming 
using  these  languages.  The  presentation  of  the  various  languages  kept  a  strict  chronological 
sequence  to  better  illustrate  the  mutual  influence  among  the  proposals.  The  diagram  in 
figure  38  summarizes  the  influences  by  sketching  the  family  tree  of  the  languages  discussed  in 
this  paper.  Although  there  seems  to  be  a  large  variety  of  languages,  the  various  members  are 
close  to  one  another  and,  more  importantly,  share  most  of  the  programming  techniques.  In  this 
sense,  the  various  research  forces  spread  around  the  world  promoting  the  various  Relational 
Language descendants can be actually thought of as a common joint effort to establish a
universal parallel logic language with well-understood semantics, efficient implementation and
the capability of providing a rich variety of parallel programming techniques.

Figure 38: The parallel logic programming languages family tree.

Before  an  acceptable  solution  can  be  reached,  several  problems,  many  of  which  have 
already been identified, need to be solved. The efficiency of the languages on an actual
parallel machine is perhaps the top priority. So far, experimental implementations have not
come close to a performance level that would justify logic programming as the recommended
approach for exploiting parallel machines. Extensive compiler optimizations should be devised and
successfully  applied,  if  a  step  ahead  in  this  direction  is  to  be  made.  But  the  success  of  this 
approach  is  directly  dependent  on  the  simplicity  of  the  languages  themselves.  Optimizations 
can  only  perform  as  claimed  on  conceptually  simple  languages,  free  of  impurities  and  side- 
effects.  The  same  requirements  are  posed  if  a  formal  declarative  semantics  is  to  be  obtained. 
This  theoretical  foundation  will,  in  turn,  facilitate  program  transformation  techniques,  such 
as  partial  evaluation  or  source-to-source  transformations  which  appear  indispensable  when 
trying  to  maintain  expressiveness  and  simplicity  without  sacrificing  efficiency. 

Diversity  of  applications  is  also  very  important  at  this  stage  of  language  development. 
Apart from indicating the strong and weak points of the logic programming approach to
parallel programming, diverse applications will also help shape the language proposals themselves.
Language deficiencies and missing constructs will be exposed at the early stages
of language definition, with little negative impact on the rest of the development cycle.
Moreover, extensive programming will bring up interesting programming techniques, as it
has already done in the past.

Finally, the user interface requires improvement to speed up program development in
a parallel language. This will be particularly helpful because of the multiple programming
styles offered by a parallel logic language, namely object-oriented, process-oriented,
stream-based, metalevel and so on. Moreover, some features, such as mutable arrays, multiway
stream merging and multiple solution derivation, can be achieved in these languages but only in
an awkward manner. Some kind of macro-language on top of the regular language may help
provide a reasonable interface or even assist in the implementation.
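
As an illustration of why multiway stream merging is considered awkward, the following minimal sketch, written in GHC notation, builds a three-way merge out of two binary merge processes. The predicate names merge/3 and merge3/4 and the intermediate stream Tmp are illustrative assumptions and are not taken from any particular implementation.

```prolog
% Binary nondeterministic merge of two streams into Zs.
% Each clause commits as soon as its head matches, so whichever
% input stream has an element available first is served.
merge([X|Xs],Ys,Zs) :- true | Zs = [X|Zs1], merge(Xs,Ys,Zs1).
merge(Xs,[Y|Ys],Zs) :- true | Zs = [Y|Zs1], merge(Xs,Ys,Zs1).
merge([],Ys,Zs)     :- true | Zs = Ys.
merge(Xs,[],Zs)     :- true | Zs = Xs.

% Hypothetical three-way merge expressed as a chain of binary merges.
% Elements of As and Bs pass through two merge processes before
% reaching Out, whereas elements of Cs pass through only one.
merge3(As,Bs,Cs,Out) :- true | merge(As,Bs,Tmp), merge(Tmp,Cs,Out).
```

In this sketch every additional input stream requires another merge process, and the delay seen by an element depends on its position in the resulting tree. A macro layer of the kind suggested above could, for instance, generate such a tree from a single multiway merge declaration.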

Although  logic  programming  may  not  be  the  ultimate  answer  to  the  diverse  needs  of 
parallel programming, it does come closer than many other alternatives. Further investigation
in the directions outlined above will provide more insight into how far it can reach and how far
there  is  still  to  go. 